not narrowed significantly. In this study, we apply techniques borrowed from
contemporary  research  in  abstract  data  type  specification  to  design,  specify  and 
implement  the  physical  resources  of  an  abstract  machine  called  AM. 
I. INTRODUCTION


We  address  the  problem  of  formalizing  the  relationship  between  hardware 
and  software  resources  by  demonstrating  a  practical  methodology  for  precisely 
specifying  the  manner  in  which  the  two  may  interact.  After  a  brief  statement  of 
the  problem  and  related  topics,  we  discuss  the  theory  behind  our  research  and 
some  of  the  issues  affecting  our  results.  Finally,  we  describe  in  some  detail  the 
manifestation  of  our  efforts,  a  specification  and  implementation  of  an  abstract 
machine we call AM.

A.  THE  PORTABILITY  PROBLEM 

It  is  well  known  that  porting  large  programs  from  one  machine  to  another  is 
an  expensive  ordeal.  It  is  also  well  known  that  once  the  software  has  been  moved 
to the new machine, it is anybody's guess whether or not it will work as before¹.
Even  if  our  program  seems  to  work,  we  may  find  it  consumes  more  resources  than 
we  expected.  Indeed,  this  may  be  just  as  bad  as  if  it  did  not  work  at  all. 

There  are  a  number  of  reasons  why  the  portability  problem  is  getting  worse, 
not  better: 

-  Most  architectures,  even  those  which  profess  to  be  "language  directed", 
reflect  a  bias  toward  making  the  machine  look  like  what  the  programmer 
wants,  or  toward  some  engineering  goal,  such  as  maximizing  the  number  of 
devices. 

-  Both  languages  and  machines  are  related  to  the  data  they  manipulate  in  an 
implementation  dependent  way. 

-  Language  and  hardware  designers  pursue  their  conflicting  goals  to  the 
detriment  of  the  poor  compiler  writer,  who,  with  imprecise  tools  and 
methodologies  is  faced  with  the  job  of  implementing  ambiguous  semantics  on 
an  informally  designed  resource. 

Although  these  and  other  factors  do  adversely  contribute  to  the  imperfect  task  of 
moving  software  from  one  machine  to  another,  they  add  their  weight  to  other 
difficult  issues  in  language  design,  computer  architecture  and  software 


1 We assume, probably unjustifiably, that it worked correctly before we tried to move it.


engineering.  This  study  confines  itself  to  treating  the  issues  surrounding  the 
interaction  between  the  programmer's  view  of  the  world  as  a  problem,  and  the 
architect's  view  of  the  world  as  a  resource. 


1.  Abstraction 

Abstraction  describes  the  separation  of  the  defining  properties  of  an 
object  from  other,  unnecessary  details  about  it.  A  programmer  is  primarily 
concerned  with  solving  a  problem.  Appropriately,  the  tools  at  his  disposal, 
programming  languages,  development  aids,  the  programming  environment,  form 
a  problem  solving  abstraction.  The  hardware  (and  some  of  the  software)  on  which 
this  problem  solving  abstraction  is  implemented,  however,  is  an  abstraction  of  a 
different  sort.  Addresses,  registers,  ports,  most  of  the  operating  system  service 
routines,  all  provide  more  or  less  efficient  ways  to  manipulate  the  physical 
resources  of  the  machine  —  they  form  a  physical  resource  abstraction. 

The  fuzzy  area  between  these  two  abstractions,  sometimes  simplistically 
perceived  as  the  boundary  between  hardware  and  software,  exposes  a  number  of 
shortcomings  in  language  design  and  computer  architecture  collectively  termed 
the  semantic  gap. 

2.  The  Semantic  Gap 

The  semantic  gap  manifests  itself  anywhere  a  problem  solving  abstraction 
touches  a  physical  resource  abstraction.  A  detailed  description  may  be  found  in 
Myers  (1982).  He  observes  that  the  semantic  gap  contributes  to  the  cost  of 
software  development,  software  unreliability,  inefficiency,  complexity,  and  the 
distortion  of  programming  languages.  Certainly  no  single  development  or 
methodology  will  eliminate  this  problem. 

Narrowing  the  semantic  gap  requires  significant  changes  in  the 
fundaments  of  computer  architecture  and  language  design.  We  chose  to 
concentrate  on  three  factors  which  significantly  contribute  to  this  problem: 

-  Informally  described  semantics. 

-  Representation  dependent  data  types. 

-  Arbitrarily  designed  instruction  set  architectures. 

The  implication,  of  course,  is  that  through  increased  formalism,  the  introduction 
of  representation  independent  data,  and  a  more  thoughtful  treatment  of  the 
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instruction  set,  the  semantic  gap  can  be  narrowed.  The  balance  of  this  thesis  is 
devoted  to  describing  a  methodology  for  doing  just  that. 

B.  THREE  WAYS  TO  NARROW  THE  SEMANTIC  GAP 

1.  Formalism 

The  benefits  of  formalism  in  the  design  process  have  been  amply  revealed 
in  countless  articles  treating  this  issue  from  the  standpoint  of  software 
engineering.  Our  concern  will  be  limited  to  formalism  as  it  applies  to  the 
specification  of  an  abstraction.  Various  specification  methodologies  exist,  many 
of  which  have  been  used  with  more  or  less  success  in  projects  of  practical 
significance.  But  we  caution  the  reader  that  by  "formal"  we  mean  a 
mathematical  rigour  rooted  in  proven  theory.  The  idea  of  formalism  as  often 
applied  to  software  engineering  will  not  do  here.  A  formal  specification  is  a 
complete  description  of  the  meaning  of  an  object.  It  forms  the  basis  for  an 
abstraction  and  is  ultimately  a  bridge  over  the  semantic  gap. 

The  benefits  of  formalism  in  which  we  are  most  interested  are: 

-  It  provides  a  firm  basis  for  proving  our  assertions  about  a  specification  and 
its  implementation. 

-  It  encourages  a  discipline  on  the  part  of  the  designer  to  be  rigorously  precise. 

-  It  compels  us  to  find  ways  of  describing  things  which  are  representation 
(implementation)  independent. 

2.  Representation  Independence 

Conventional  machines  force  us,  as  programmers,  to  develop  our  own 
abstractions  of  data.  At  a  time  when  we  are  most  concerned  with  developing 
clean  algorithms  the  architecture  obligates  us  to  worry  about  status  registers  and 
word  length.  Certainly  someone  must  ultimately  deal  with  these  physical 
properties of the hardware, but this should not fall as an obligation upon the
programmer. The programmer should be free to ignore unnecessary detail.

We  will  attempt  to  minimize  the  dependence  of  data  upon  its 
representation  through  the  use  of  abstract  data  types.  Our  notion  of  data  is  very 
general  and  includes,  for  example,  program  instructions. 


3.  Intent  Expressive  Resource  Abstraction 

Conventional  architectures  do  not  permit  us  to  unambiguously  express 
our  intent  in  a  program.  Artificial  data  types  combined  with  typical  resource 
models  force  ambiguity  and  the  overloading  of  data  structures.  Stack  frames  are 
a  good  example  of  this.  The  semantics  of  the  frame  combine  those  of  an  array 
and  those  of  a  stack.  Meanwhile,  the  whole  thing  is  implemented  in  memory, 
with the data types overlaid on an array of fixed length cells.

We  claim  that  applying  methods  similar  to  those  used  to  describe 
abstract  data  types,  we  can  describe  an  abstraction  of  the  physical  resources  of  a 
machine  which  benefits  not  only  from  the  formalism  used  to  specify  it,  but  also 
permits  the  implementor  to  clearly  interpret  the  intent  of  programs  written  for  it. 

C.  METHODOLOGY 

The  goal  of  this  research  is  to  contribute  something  of  practical  significance 
to  the  study  of  software  portability  by  treating  an  area  which  has  been  largely 
ignored  —  the  design  of  a  formal  abstraction  for  the  machine  itself.  We  have 
innumerable  high  level  programming  languages,  programming  environments, 
graphics  languages,  database  machines,  file  systems,  operating  system  command 
interpreters,  a  whole  host  of  different  abstractions  tailored  to  the  task  of 
providing  us  with  just  enough  information  to  do  everything  we  need  to  do,  and 
nothing  more.  So  why,  then,  have  we  failed  to  develop  abstractions  for  the 
hardware  resources,  upon  which  we  are  so  dependent,  which  are  more  than  just  a 
collection of registers, opcodes and some arbitrary rules about how they interact?
A  more  difficult  but  certainly  more  important  task  than  actually  defining  the 
abstraction  is  developing  a  methodology  for  producing  more. 

Our  method  has  been  to  take  a  naive  approach  toward  all  areas  of  the  design 
and implementation process not directly related to the specification itself. We do
this  for  two  reasons.  First,  we  can  take  for  granted  the  large  body  of  research  in 
programming  languages  and  computer  architecture  —  we  are  designing  neither  a 
language  nor  a  processor,  even  though  ad  hoc  examples  were  required  to  complete 
the  implementation.  Second,  the  research  is  intended  to  benefit  programmers. 
Since  it  is  unreasonable  to  expect  those  who  may  use  this  method  to  understand 
the  theory  behind  the  specification,  the  key  to  understanding  the  reasons  for  our 
design decisions lies in the way we coded it. Thus, cleverness has been eschewed
in  favor  of  clarity. 

Our  task  in  this  thesis,  then,  is  to  examine  a  wide  range  of  issues  which 
impinge  on  the  process  of  designing  and  implementing  the  specification  of  a 
machine, and then to describe how we went about actually doing it.
Another area where we run into trouble is the number of terms induced
by  the  specification.  If  the  number  of  terms  is  infinite,  then  the  specification  is 
not  even  representable,  let  alone  computable.  This  problem  can  easily  creep  into 
a  specification  unnoticed. 

5.  Parameters 

To  reduce  the  enormous  complexity  of  some  types  of  specifications, 
designers  have  turned  to  the  use  of  parameterized  specifications.  The  basic  idea 
is to write a specification whose signature is a template for other specifications.
A  typical  example  would  be  the  type  string.  We  might  want  strings  of  other 
objects  besides  characters.  Rather  than  duplicate  what  is  essentially  the  same 
specification,  it  is  parameterized  and  instantiated  when  needed  in  the 
specification.  The  need  for  parameterized  specifications  goes  beyond  a  simple 
savings  in  the  effort  of  writing  a  specification.  Some  objects  are  simply  not 
describable  without  them. 
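
As a rough illustration of the idea, a parameterized type can be sketched in Python, where the element type plays the role of the formal parameter. The class name String and its operators empty, append and length are assumptions made for this sketch; they are not taken from the AM specification.

```python
from typing import Generic, Tuple, TypeVar

T = TypeVar("T")  # the formal parameter: the sort of the elements


class String(Generic[T]):
    """Immutable string over an arbitrary element type (hypothetical sketch)."""

    def __init__(self, items: Tuple[T, ...] = ()) -> None:
        self._items = tuple(items)

    @staticmethod
    def empty() -> "String[T]":
        return String()

    def append(self, item: T) -> "String[T]":
        # Returns a new value and leaves the old one untouched,
        # mirroring the side-effect-free operators of a specification.
        return String(self._items + (item,))

    def length(self) -> int:
        return len(self._items)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, String) and self._items == other._items

    def __hash__(self) -> int:
        return hash(self._items)


# Two instantiations of the same template.
chars = String.empty().append("a").append("b")
ints = String.empty().append(1).append(2).append(3)
```

The single definition serves both strings of characters and strings of integers, which is the saving in effort the text describes.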

Parameterized  specifications  are  still  not  well  understood.  Most  of  the 
underlying  theory  concerning  them  is  under  debate.  We  have  minimized  our  use 
of  them,  as  a  result. 

6.  Errors 

What  to  do  when  objects  which  are  not  members  of  the  right  carrier  sets 
find  their  way  into  operations  is  a  real  problem,  as  is  the  "propagation"  of  errors 
throughout  the  specification.  A  number  of  ways  of  handling  this  have  been 
proposed,  most  notably  by  Goguen  (1978),  but  without  much  success.  We  have 
found  that  in  implementation  this  is  not  a  problem,  however,  and  have  generally 
taken the point of view that the most important thing to specify is where errors
are  explicitly  detected  rather  than  what  to  do  about  them  once  they  are. 

7.  Proving  Correctness 

We  have  not  mentioned  provability  except  to  say  that  a  formal  design 
methodology  tends  to  support  formal  methods  of  proof.  It  is  beyond  the  scope  of 
this  study  to  treat  this  issue  in  detail,  but  we  will  return  to  it  in  the  discussion  of 
our  implementation. 


Unfortunately,  there  are  many  accepted  features  of  conventional  processors  which 
are extremely difficult if not impossible to describe with algebraic specifications.
Take,  for  example,  the  typical  primitive  of  all  machine  types  —  the  bit.  The 
reader  is  encouraged  to  attempt  to  write  a  semantic  description  of  two's 
complement  arithmetic,  or  operation  of  a  status  register.  True,  one  of  the  goals 
we have stated is representation independence, but for this we give up the
freedom  to  design  anything  we  might  conceive. 

Another  important  limitation  is  the  difficulty  with  modality.  How  does 
one specify when an operation is to occur? In operators whose arity is greater
than  one,  the  arguments  are  assumed  to  "arrive"  simultaneously,  and  side  effects 
are  not  allowed.  A  number  of  techniques  have  been  suggested  as  to  how  timing 
might  be  formally  expressed  (Giegerich  1983)  but  we  use  only  the  simplest. 
Modality  is  expressed  in  terms  of  parametric  dependencies. 

prog(a,q)  =  xeq(atomofinstr(fetchm(a,q),a,q)); 

In  the  above  example,  the  operator  fetchm  is  applied  "before"  atomofinstr, 
which  is  applied  before  xeq. 
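
The ordering can be observed in a small Python sketch. The three function bodies below are placeholders invented for illustration (the text does not define them here); only the nesting of the calls is taken from the equation above. Python evaluates arguments before applying a function, so the data dependency alone fixes the order.

```python
# Records the order in which the operators are actually applied.
trace = []


def fetchm(a, q):
    # Placeholder body: stands in for fetching an instruction at address a.
    trace.append("fetchm")
    return ("instr", a)


def atomofinstr(instr, a, q):
    # Placeholder body: stands in for decoding the fetched instruction.
    trace.append("atomofinstr")
    return ("atom", instr)


def xeq(atom):
    # Placeholder body: stands in for executing the decoded atom.
    trace.append("xeq")
    return ("done", atom)


def prog(a, q):
    # Same shape as the axiom: each operator needs the result of the
    # one nested inside it before it can be applied.
    return xeq(atomofinstr(fetchm(a, q), a, q))


result = prog(0, "q0")
```

After the call, trace lists the operators innermost first, which is the "before" relation expressed by the parametric dependency.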

4.  Finiteness 

No  matter  what  we  describe  with  a  formal  specification,  any 
implementation  of  the  specification  must  be  finite.  Hence  a  problem:  how  do  we 
describe a finite limitation of an infinite set of objects? Consider again Figure 3.1.
It  specifies  a  type  with  a  countable  infinity  of  objects.  But  we  have  no  machines 
to  represent  an  infinity  of  numbers.  The  problem  does  not  stop  there.  A 
specification  for  the  natural  data  type,  which  will  look  much  the  same  as  the  one 
for  integer,  will  also  describe  a  countable  infinity  of  objects.  In  a  world 
accustomed  to  twice  as  many  signed  as  unsigned  integers,  this  will  come  as  a 
great  shock! 

Obviously,  any  actual  machine  will  be  finite.  And  although  the  problem 
we  have  just  described  may  seem  more  metaphysical  than  physical,  it  forces  us  to 
realize that we will never be able to fully implement a specification. More
important,  it  also  requires  us  to  deal  with  boundary  conditions  in  ways  which 
may  not  preserve  our  methodology. 
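
A minimal sketch of the boundary problem, assuming a hypothetical 8-bit signed implementation (the word size, the names WIDTH, MAX_INT and MIN_INT, and the RangeError exception are all assumptions of this example): the axiom predint(succint n) = n holds everywhere except at the upper limit, where the implementation has to do something the specification never mentions.

```python
WIDTH = 8                          # assumed word size, for illustration only
MAX_INT = 2 ** (WIDTH - 1) - 1     # largest representable value
MIN_INT = -(2 ** (WIDTH - 1))      # smallest representable value


class RangeError(Exception):
    """Raised when a result would fall outside the finite carrier."""


def succint(n):
    if n >= MAX_INT:
        raise RangeError("succint has no representable result")
    return n + 1


def predint(n):
    if n <= MIN_INT:
        raise RangeError("predint has no representable result")
    return n - 1


def axiom_holds(n):
    """Check predint(succint n) = n for one representable value n."""
    try:
        return predint(succint(n)) == n
    except RangeError:
        return False


# Every value below the upper boundary satisfies the axiom.
interior_ok = all(axiom_holds(n) for n in range(MIN_INT, MAX_INT))
# The boundary value itself does not.
boundary_ok = axiom_holds(MAX_INT)
```

Raising an error is only one possible policy; wrapping around or saturating would break the axiom in a different way.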
A. FIRST PRINCIPLES


Any  formal  specification  must  begin  with  some  assumptions,  "first 
principles",  upon  which  the  whole  methodology  is  based.  We  have  already 
discussed  the  mathematical  basis  for  our  specification  method.  Let  us  look  now 
at  its  implications. 

1.  Assumptions 

At  some  point  we  will  have  to  describe  the  operations  our  abstract 
machine  will  perform.  Some  of  these  operations  can  be  defined  explicitly.  Most, 
however,  will  be  defined  in  terms  of  certain  primitives,  the  meanings  of  which  we 
simply  take  for  granted.  For  example,  it  is  probably  a  good  idea  for  AM  to  be 
able  to  perform  integer  arithmetic.  So,  we  specify  the  integer  data  type.  Figure 
3.1  gives  us  just  about  everything  we  need,  except  for  one  thing.  The 
specification  does  not  describe,  nor  should  it  describe,  the  way  in  which  strings 
from  an  alphabet  may  be  uniquely  associated  to  the  elements  of  the  type.  As 
written,  the  specification  obligates  us  to  refer  to  the  "number"  5  as 

succint(succint(succint(succint(succint(zeroint))))) 

Not  very  convenient. 
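
The missing association between numerals and terms is easy to supply outside the specification. The following sketch, with the hypothetical helper names numeral and value, converts between ordinary non-negative integers and the nested succint form; it is an illustration of the convenience we are assuming, not part of the specification.

```python
def numeral(n):
    """Build the term for a non-negative integer n as nested succint applications."""
    if n < 0:
        raise ValueError("this sketch only covers non-negative integers")
    term = "zeroint"
    for _ in range(n):
        term = f"succint({term})"
    return term


def value(term):
    """Recover the integer denoted by a term built only from succint and zeroint."""
    count = 0
    prefix = "succint("
    while term.startswith(prefix) and term.endswith(")"):
        term = term[len(prefix):-1]   # strip one succint(...) layer
        count += 1
    if term != "zeroint":
        raise ValueError("not a succint/zeroint term")
    return count


five = numeral(5)
```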

Thus,  we  consciously  limit  the  scope  of  our  formalism  in  the  interest  of 
practicality.  Our  use  of  a  formal  specification  is  intended  to  improve  our 
understanding  of  the  meaning  of  a  physical  resource,  not  elementary  number 
theory. 

2.  Notation  and  Syntax 

Although  the  notation  ultimately  used  to  express  the  specification  is 
arbitrary,  the  need  for  automatic  parsing  means  the  usual  syntactic  and  semantic 
considerations  facing  the  designers  of  any  programming  language  are  before  us  as 
well.  In  addition,  since,  as  we  will  show,  the  specification  of  anything  useful  is 
likely  to  be  complex,  we  may  also  need  to  choose  notation  which  supports 
automatic  program  generation,  or  at  least  macro  processing. 

3.  The  Limitations  of  Algebraic  Specifications 

Once we are committed to a formal design methodology, it will be difficult
to justify departures from the method. This means we have limited ourselves to
designing objects which can be described with the semantics we have defined.


III.  ISSUES 


The  elegance  with  which  algebraic  specifications  solve  the  mechanical 
problem  of  specifying  an  abstraction  does  nothing  to  help  solve  a  whole  host  of 
other  problems  affecting  our  design.  In  fact,  the  use  of  a  formal  methodology  has 
imposed  constraints  which  would  not  normally  affect  more  conventional 
architectures.  We  treat  these  and  other  issues  now. 


spec integer
is
   extend
      boolean
   with
      sort
         int;
      op
         predint: int -> int;
         succint: int -> int;
         sumint: int, int -> int;
         zeroint: -> int;
         eqint: int, int -> bool;
         gtint: int, int -> bool;
      axiom
         predint(succint n) = n;
         commutative(sumint, int);
         associative(sumint, int);
         sumint(n, zeroint) = n;
         sumint(n, succint m) = succint(sumint(n, m));
         equivrel(eqint, int);
         irreflexive(gtint, int);
         transitive(gtint, int);
   end extend;
end integer;

Figure  3.1:  A  Spec  for  the  Abstract  Type  Integer 
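
To make the role of the axioms concrete, the sketch below treats three of them (predint(succint n) = n, sumint(n, zeroint) = n and sumint(n, succint m) = succint(sumint(n,m))) as left-to-right rewrite rules over terms represented as Python tuples. The tuple encoding and the helper names are assumptions of this example, and the remaining axioms of the figure are deliberately not handled.

```python
ZERO = ("zeroint",)


def succ(t):
    return ("succint", t)


def pred(t):
    return ("predint", t)


def add(a, b):
    return ("sumint", a, b)


def normalize(t):
    """Rewrite a term using three axioms of Figure 3.1, read left to right."""
    op = t[0]
    if op == "zeroint":
        return t
    if op == "succint":
        return ("succint", normalize(t[1]))
    if op == "predint":
        arg = normalize(t[1])
        if arg[0] == "succint":          # predint(succint n) = n
            return arg[1]
        return ("predint", arg)          # none of the three rules applies
    if op == "sumint":
        n, m = normalize(t[1]), normalize(t[2])
        if m[0] == "zeroint":            # sumint(n, zeroint) = n
            return n
        if m[0] == "succint":            # sumint(n, succint m) = succint(sumint(n,m))
            return normalize(("succint", ("sumint", n, m[1])))
        return ("sumint", n, m)          # none of the three rules applies
    raise ValueError(f"unknown operator {op!r}")


two = succ(succ(ZERO))
three = succ(two)
five = normalize(add(two, three))
```

Terms such as predint(zeroint) are left unreduced, which is consistent with the specification describing an infinity of objects in both directions.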
We can prove a Turing machine is describable by these algebras. Therefore,
we  can  describe  a  computer  with  these  algebras. 


final  quotient  algebra,  then  we  say  the  candidate  describes  the  "final  algebra 
semantics"  of  the  specification.  If  the  congruence  matches  neither  the  initial  nor 
the  final  quotient  algebra,  then  the  candidate  does  not  describe  the  meaning  of 
the  specification. 

Now  that  we  know  how  to  determine  whether  or  not  our  specification 
describes  something  "real",  the  next  step  is  to  show  that  the  object(s)  it  describes 
has  the  properties  we  intended.  Rather  than  discuss  the  details,  we  direct  the 
reader  again  to  Goguen  (1978)  and  Davis  (1984),  and  instead  list  some  of  the  key 
results  of  this  theory  below: 

-  A  specification  is  an  abstraction  of  a  concrete  object.  It  forms  a  template  for 
an  algebra  which  must  ultimately  describe  the  meaning  of  the  specification. 

-  A  term  algebra  contains  only  those  terms  composed  of  operators  without  free 
variables.  A  free  algebra  permits  terms  with  free  variables. 

-  The  axioms  of  a  specification  are  really  equations  between  terms  in  a  free 
algebra. 

-  The  initial  algebra  semantics  of  a  specification  is  defined  by  the  class  of 
algebras  whose  signature  is  given  in  the  specification,  with  the  property  that 
any  two  formal  terms  are  provably  equal  from  the  axioms  if  and  only  if  the 
corresponding  expressions  in  the  algebra  of  that  class  evaluate  to  the  same 
constant. 

-  The  final  algebra  semantics  of  a  specification  is  defined  by  the  class  of 
algebras  whose  signature  is  given  in  the  specification,  with  the  property  that 
any  two  formal  terms  are  consistent  with  the  axioms  if  and  only  if  the 
corresponding  expressions  evaluate  to  the  same  constant. 

-  Any  two  final  or  any  two  initial  algebras  for  a  specification  are  isomorphic. 

-  The  object  a  specification  specifies  is  computable  if  and  only  if  the  class  of 
initial  algebras  and  the  class  of  final  algebras  are  the  same.  Likewise,  any 
time  one  can  show  all  formal  terms  reduce  to  a  0-ary  term  (a  constant),  then 
the  initial  and  final  algebraic  semantics  must  be  the  same. 

-  An  algebra  is  effectively  computable  when  its  signature  matches  that  of  the 
specification,  its  carrier  sets  are  enumerable,  and  the  operations  defined  by 
its  operators  can  be  described  using  algorithms. 

-  Any  time  one  can  show  the  class  of  initial  and  final  algebras  is  not  the  same, 
there  exists  at  least  one  algebra  which  is  not  effectively  computable. 

-  Any  two  specifications  which  can  be  shown  to  produce  the  same  class  of 
algebras  are  equivalent  (semantically). 

One  final  result  of  this  theory  forms  a  key  part  of  the  foundation  for 
believing  it  possible  to  describe  an  abstract  machine  with  algebraic  specifications: 
The second congruence is such that two terms, t and t', are congruent if and
only if the equation t = t' is consistent with the axioms. An equation is
consistent if, by adding it to the set of axioms, the resulting set of terms is not
trivial. A set of terms is considered trivial when any two terms of
the same sort are provably equal.

A  correspondence  H  associating  the  carrier  sets  and  operators  of  one  algebra 
A  to  the  carrier  sets  and  operators  of  another  algebra  B  can  be  shown  to  be  a 
homomorphism,  provided: 

-  The  correspondence  between  carrier  sets  preserves  the  sort  type. 

-  The  correspondence  between  operators  preserves  the  characteristic. 

-  The  standard  property  of  homomorphisms  holds: 

H(o(t_1, t_2, ..., t_n)) = H(o)(H(t_1), H(t_2), ..., H(t_n))

where each t_i is an element of sort s_i in A.

There  is  a  canonical  homomorphism  from  a  term  algebra  to  an  algebra  A  ,  with 
the  same  signature,  which  can  be  determined  by  evaluating  each  formal  term  in 
the  term  algebra  through  its  corresponding  term  in  A  . 
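
A minimal sketch of this canonical homomorphism, assuming the boolean signature (true, false, not, and) as the running example and a tuple encoding of terms that is our own convention: evaluation maps each formal term to a value in a target algebra by interpreting the outermost operator on the values of its subterms.

```python
# One possible target algebra A: Python's own truth values.
BOOL_ALGEBRA = {
    "true": lambda: True,
    "false": lambda: False,
    "not": lambda x: not x,
    "and": lambda x, y: x and y,
}


def evaluate(term, algebra):
    """Canonical map H from formal terms to values in the given algebra.

    A term is a tuple (operator_name, subterm, ...).
    """
    op, *args = term
    # H(o(t_1, ..., t_n)) = H(o)(H(t_1), ..., H(t_n))
    return algebra[op](*(evaluate(a, algebra) for a in args))


t1 = ("not", ("false",))
t2 = ("and", ("true",), ("not", ("true",)))
whole = ("and", t1, t2)

lhs = evaluate(whole, BOOL_ALGEBRA)
rhs = BOOL_ALGEBRA["and"](evaluate(t1, BOOL_ALGEBRA), evaluate(t2, BOOL_ALGEBRA))
```

Because evaluate is defined by recursion on the structure of the term, the homomorphism property holds by construction; lhs and rhs above are one instance of it.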

E.  ALGEBRAIC  SEMANTICS 

We  learn  to  ascribe  meaning  to  abstraction  by  associating  with  that 
abstraction  concrete  objects.  In  some  sense,  we  "know"  the  meaning  of  concepts 
like  table  and  tree  when  we  can  recognize  the  class  of  objects  captured  by  those 
concepts². In the "world" of algebraic specifications, the concrete objects are the
algebras.  They  form  the  manifestation  of  the  meaning  of  a  specification.  As  we 
have  said,  a  specification  induces  three  congruences:  a  congruence  defined  by  the 
algebra  on  the  evaluated  terms,  a  congruence  on  the  initial  quotient  algebra  and 
a  congruence  on  the  final  quotient  algebra.  We  can  determine  whether  or  not  a 
candidate  algebra  captures  the  meaning  of  a  specification  by  examining  the 
properties  of  the  congruence  it  induces.  If  the  congruence  is  identical  with  the 
initial  quotient  algebra,  then  we  say  the  candidate  describes  the  "initial  algebra 
semantics"  of  the  specification.  Likewise,  if  the  congruence  is  identical  with  the 

2 The author would be happy to discuss the epistemic implications of this statement with the
philosophically  minded  reader  some  other  time. 


the set of terms forms a language on the alphabet obeying the following
grammar: 

- For each sort s in S add the production

<T_S> -> <T_s>

where T_S is the set of all terms which can be created from the signature
which contain no free variables and T_s is the set of all terms in T_S of sort s.
Note that T_S and T_s are both term algebras.

- For each operator o of characteristic s_1, ..., s_n -> s add the production

<T_s> -> 'op(' <T_s_1> ',' ... ',' <T_s_n> ')'

where 'op' is a name uniquely associated to o.

- For each free variable X of sort s, add the production

<T_s> -> 'X'

The reader will note that the grammar just described is LL(1), and thus can be
parsed  very  efficiently,  particularly  by  automatically  generated  parsers.  This  is 
the  theoretical  basis  for  the  methodology  described  in  Guttag  (1978a). 
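
To suggest why one symbol of lookahead suffices, here is a hand-written recursive-descent parser for prefix terms. It is a sketch under assumptions: the SIGNATURE table covers only the boolean operators, constants are written without parentheses, and the tuple output format is our own choice rather than anything prescribed by the methodology.

```python
import re

# Assumed signature table: operator name -> (argument sorts, result sort).
SIGNATURE = {
    "true": ((), "bool"),
    "false": ((), "bool"),
    "not": (("bool",), "bool"),
    "and": (("bool", "bool"), "bool"),
}


def tokenize(text):
    # Names and the three punctuation symbols; anything else is ignored.
    return re.findall(r"[A-Za-z_][A-Za-z_0-9]*|[(),]", text)


def parse(text, variables=None):
    """Parse a prefix term; return (tree, sort).

    `variables` maps free-variable names to their sorts.
    """
    variables = variables or {}
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def term():
        nonlocal pos
        name = peek()
        if name is None or name in ("(", ")", ","):
            raise SyntaxError(f"expected a name, found {name!r}")
        pos += 1
        # The single token just read selects the production to apply.
        if name in variables:
            return (name,), variables[name]
        if name not in SIGNATURE:
            raise SyntaxError(f"unknown operator or variable {name!r}")
        arg_sorts, result = SIGNATURE[name]
        args = []
        if arg_sorts:
            expect("(")
            for i, want in enumerate(arg_sorts):
                if i:
                    expect(",")
                sub, got = term()
                if got != want:
                    raise TypeError(f"{name}: argument {i + 1} has sort {got}, wanted {want}")
                args.append(sub)
            expect(")")
        return (name, *args), result

    tree, sort = term()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree, sort


tree, sort = parse("and(true, not(b))", {"b": "bool"})
```

The parser never backtracks: each call to term reads one token and from it knows both how many arguments to expect and of which sorts.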

Now, the set of axioms induces two canonical congruences on T_S, which in
turn induce two quotient algebras on T_S, called the initial quotient algebra and
the final quotient algebra. The first congruence is such that two terms, t and t',
are congruent if and only if the assertion t = t' can be proven from the axioms.
The  following  rules  apply  (Davis  1984): 

-  Any  axiom  is  a  proven  equation.  Any  conditional  axiom  is  a  valid  rule  of 
inference  for  proving  equations  from  proven  equations. 

- If, in a proven equation, every occurrence of a free variable is replaced by a
single  term  of  the  same  sort,  the  resulting  equation  is  proven. 

-  If,  in  an  equation,  some  term  is  replaced  by  a  term  which  is  provably 
equivalent,  the  resulting  equation  is  proven. 

-  Any  equation  derived  from  proven  equations  using  the  reflexive,  symmetric 
or  transitive  laws  for  equality  is  proven. 

From  these  it  can  be  shown  that  the  relation  defined  by  all  pairs  of  provably 
equal  terms  is  a  congruence. 
boolean type, i.e., the set {t, f}, and if, using conventional notation, we associate
the operators in the specification for bool with some representative operators

true   t
false  f
and    &
not    ¬

then the equations

¬(&(x,&(t,y))) = ¬(&(x,y))

&(x,&(y,f)) = f

both  satisfy  the  axioms  of  the  specification.  Important  to  note,  however,  is  that 
the  symbols  we  choose  to  associate  with  the  operators  in  a  specification  are 
completely  arbitrary.  If  we  instead  make  the  following  associations 


bool    {a}
true    -> a          (constant ops returning 'a')
false   -> a
and     (x,y) -> a    (trivial binary op)
not     (x) -> a      (trivial unary op)

it  can  be  easily  shown  that  this  algebra  also  satisfies  the  axioms,  but  we  would 
not  admit  that  it  is  representative  of  the  boolean  type.  Thus,  we  make  a 
distinction  between  an  algebraic  specification  and  an  algebra.  What  then  is  the 
meaning  of  a  specification?  It  is  the  class  of  algebras  which  is  uniquely  associated 
to  that  specification.  The  precise  nature  of  this  association  will  be  discussed  in 
Section  E. 
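
The claim that the degenerate algebra also satisfies the axioms can be checked mechanically. The sketch below assumes the six boolean axioms (not(true) = false, not(not(b)) = b, and(true,b) = b, and(false,b) = false, commutativity and associativity of and) and tests them exhaustively over a finite carrier; the function name and argument layout are our own.

```python
def satisfies_bool_axioms(carrier, true, false, not_, and_):
    """Check the six boolean axioms exhaustively over a finite carrier set."""
    if not_(true) != false:                      # not(true) = false
        return False
    for b in carrier:
        if not_(not_(b)) != b:                   # not(not(b)) = b
            return False
        if and_(true, b) != b:                   # and(true,b) = b
            return False
        if and_(false, b) != false:              # and(false,b) = false
            return False
    for b1 in carrier:
        for b2 in carrier:
            if and_(b1, b2) != and_(b2, b1):     # commutativity
                return False
            for b3 in carrier:
                # associativity
                if and_(and_(b1, b2), b3) != and_(b1, and_(b2, b3)):
                    return False
    return True


# The algebra we normally accept as the boolean type.
standard = satisfies_bool_axioms(
    {True, False}, True, False, lambda x: not x, lambda x, y: x and y
)

# The one-element algebra in which every operator returns 'a'.
trivial = satisfies_bool_axioms(
    {"a"}, "a", "a", lambda x: "a", lambda x, y: "a"
)
```

Both checks succeed, so satisfying the axioms is not by itself enough to single out the intended type.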

D.  TERM,  INITIAL  AND  FINAL  ALGEBRAS 

Given  a  specification  (S,E),  with  signature  S  and  axioms  E,  our  next  problem 
is  to  show  that  there  are  indeed  algebras  with  that  signature  which  satisfy  the 
axioms.  Using  the  specification  for  boolean  as  an  example,  the  term  algebra  is  the 
set  of  all  term  expressions  which  can  be  constructed  without  violating  the 
characteristic  of  an  operator  in  the  specification.  This  set  of  terms  is  obtained 
using a technique known as the Herbrand construction.
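
A bounded version of the construction can be sketched as follows, assuming the single-sorted boolean signature and representing terms as strings in prefix notation (both choices are ours). Starting from the constants, each round applies every operator to every combination of terms built so far.

```python
from itertools import product

# Assumed signature: operator name -> arity (all arguments of sort bool).
SIGNATURE = {"true": 0, "false": 0, "not": 1, "and": 2}


def ground_terms(depth):
    """All variable-free terms whose operator nesting is at most `depth`."""
    terms = {name for name, arity in SIGNATURE.items() if arity == 0}
    for _ in range(depth):
        new = set(terms)
        for name, arity in SIGNATURE.items():
            if arity == 0:
                continue
            # Apply the operator to every tuple of existing terms.
            for args in product(sorted(terms), repeat=arity):
                new.add(f"{name}({','.join(args)})")
        terms = new
    return terms


level0 = ground_terms(0)
level1 = ground_terms(1)
```

The full term algebra is the union over all depths and is infinite; the growth from 2 to 8 to 74 terms over the first three levels shows how quickly it expands.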

If  we  view  terms  as  strings  on  an  alphabet  of  operator  names,  some  useful 
punctuation  symbols  and  a  finite  set  of  symbols  for  the  names  of  free  variables. 
any others we might need. Not and and are those traditionally encountered in
computer hardware. Thus, so far we have:


sort 

bool 

ops 

true: -> bool
false: -> bool
not:  bool  ->  bool 
and:  bool, bool  ->  bool 

Now  we  need  some  axioms  to  describe  the  semantics  of  the  operators  on  values  of 
type  bool. 


not(true)  =  false 

not(not(b))  =  b 

and(true,b)  =  b 

and(false,b)  =  false 

and(b1,b2) = and(b2,b1)

and(and(b1,b2),b3) = and(b1,and(b2,b3))

These  axioms  explicitly  describe  all  the  essential  properties  of  the  operators. 
Notice  that  no  explicit  mention  is  made  of  the  possible  composition  of  a  set  of 
boolean  values.  We  know  true  and  false  must  be  in  it,  but  they  may  not  be 
unique. 

The  syntax  used  in  Figure  2.1  was  chosen  to  permit  automatic  compilation. 
To  date,  no  such  compiler  has  been  written,  but  a  syntax  directed  editor 
operating  on  a  similar  syntax  is  available  (Lilly  1984). 

Algebraic  specifications  are  composed  of  a  signature,  which  includes  the 
operators  and  sorts,  and  axioms.  Axioms  are  simply  equations  between  terms 
(expressions)  made  up  of  operators  and/or  free  variables  from  the  specification. 
Axioms may be conditional. A specification is thus a pair (S,E) where S is the
signature  and  E  is  a  set  of  axioms. 

An  algebra's  signature  matches  the  specification  if  there  is  a  one  to  one 
correspondence  between  the  sorts  and  operators  of  the  specification  and  the 
carriers  and  operators  of  the  algebra,  and  if  the  operations  on  elements  of  the 
carriers  are  consistent  with  the  semantics  specified  by  the  axioms  in  the 
specification.  If  we  associate  the  name  bool  to  what  we  normally  accept  as  the 


and  result  are  known  as  the  characteristic  of  the  operator.  A  sort  is  an  index 
into  a  set  of  carriers.  The  carrier  indexed  by  the  sort  represents  the  data  type, 
while  the  sort  represents  the  "name"  of  that  type,  like  "integer",  "boolean"  or 
"character".  Each  operator  is  thus  an  explicitly  defined  function  which  accepts 
zero  or  more  typed  arguments  and  returns  a  typed  result.  The  type  of  each 
argument  may  differ  from  every  other  argument  as  well  as  from  the  result  type. 

Any  usefully  complex  data  type  requires  the  use  of  free  variables.  Their 
presence introduces the possibility that the type we specify may not be finitely
describable. Note, "finitely describable" does not mean "describes a finite number
of objects". It means the description is itself finite, i.e., the number of operators
and the number of axioms is finite. It is sometimes difficult to find a finite
description of a type, since many mathematical and logical operations assume a
non-finite application. Simple iteration is an example of this.

Since  we  hope  to  describe  something  in  a  way  which  is  representation 
independent,  we  must  be  certain  we  introduce  no  representational  bias  into  the 
specification.  If  we  ultimately  hope  to  show  how  to  use  these  methods  to 
describe  hardware  in  a  representation  independent  way,  we  certainly  do  not  want 
to  require  the  use  of  a  particular  architecture  unless  we  can  demonstrate  its 
generality. 

C.  ALGEBRAIC  SPECIFICATIONS 

A  specification  is  a  template  for  the  sets  and  operators  in  the  algebra.  The 
semantics  of  the  type  are  specified  using  axioms,  which  are  provable  equations 
constructed  from  operators  and  free  variables.  The  template  makes  no 
assumptions  about  the  elements  of  the  sets  in  the  algebra,  or  about  how 
operators  are  applied  to  manipulate  the  elements.  This  information  is  furnished 
by  the  axioms.  Let  us  now  return  to  Figure  2.1. 

In  the  case  of  the  specification  for  boolean,  we  require  a  single  set  to  hold  the 
values  of  the  type.  Call  it  bool.  Next,  we  specify  two  constants1  to  represent 
the  two  possible  values  of  any  object  in  the  type.  Call  them  true  and  false. 
Then  we  select  the  smallest  set  of  operators  from  which  we  know  we  can  derive 


1 Really 0-ary operators.
This leads us to a number of important facts about abstract data types:

-  The  meaning  of  an  abstract  type  specification  does  not  depend  upon  the 
notation  used  to  express  it. 

-  A  type  may  imply  other  operators  than  those  mentioned  explicitly  in  the 
specification. 

-  Two  "equivalent"  data  types  may  not  only  "look"  different,  but  may  be 
specified  using  fundamentally  different  operators  and  axioms. 

Proving  two  abstract  types  to  be  equivalent  is  not  trivial.  To  do  this  we  must  be 
able  to  show  that  the  set  of  possible  expressions  built  using  the  operators  from 
one  specification  is  in  some  way  "equivalent"  to  the  set  produced  with  operators 
from  the  other. 

An  expression  is  constructed  from  the  operators  and  values  of  the  type  using 
composition  of  operators.  Here,  using  prefix  notation,  are  two  correctly  formed 
boolean  expressions: 

not(and(true,false))

and(true,and(false,not(true)))

Note that they do not contain variables, but are constructed solely from operators.
Below is an expression with free variables:

and(not(x),or(true,y)) 

Evaluating  an  expression  is  straightforward,  but  the  presence  of  free  variables 
makes  it  much  more  difficult  to  prove  assertions  about  the  specification  of  the 
type. 
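
Evaluation of such expressions can be sketched in Python. The representation is an assumption made for illustration: a term is either a string (a constant or a free variable) or a tuple holding an operator name followed by its arguments, and the function evaluate is hypothetical. The operator or is included only because the expression with free variables uses it.

```python
def evaluate(term, env=None):
    """Evaluate a boolean term written as nested tuples in prefix form.

    Strings other than "true" and "false" are free variables looked up in env.
    """
    env = env or {}
    if isinstance(term, str):
        if term == "true":
            return True
        if term == "false":
            return False
        return env[term]              # KeyError if the variable is unbound
    op, *args = term
    vals = [evaluate(a, env) for a in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    raise ValueError(f"unknown operator {op!r}")
```

The two ground expressions evaluate directly. The expression with free variables has no value until x and y are bound; trying every binding shows that, in this model, it always equals not(x), which hints at why assertions about terms with free variables call for proof rather than evaluation.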

The  notation  we  used  in  Figure  2.1  is  not  arbitrary.  It  reflects  the 
fundamental  theory  upon  which  our  method  of  specification  is  based. 

B.  ALGEBRAS 

An  algebra  is  an  aggregate  of  operators  and  sets.  The  sets  describe  argument 
and  result  types  for  each  operator,  while  the  operators  define  the  ways  in  which 
arguments  may  be  manipulated  to  form  results.  The  general  form  of  an  operator 
is: 

$o : a_1, a_2, \ldots, a_{n-1} \rightarrow a_n$

where each $a_i$ is a carrier set of sort $s_i$. The set and arrangement of arguments


spec boolean
is
   sort
      bool;

   primitive
   op
      true: -> bool;
      false: -> bool;
      not: bool -> bool;
      and: bool,bool -> bool;

   axiom
      false = not(true);
      not(not(b)) = b;
      and(true,b) = b;
      and(false,b) = false;
      and(b1,b2) = and(b2,b1);
      and(and(b1,b2),b3) = and(b1,and(b2,b3));

end boolean;


Figure  2.1:  A  Spec  for  the  Abstract  Type  Boolean 


permits  us  to  specify  both  problem  solving  and  physical  resource  abstractions 
with  equal  facility.  Hence,  we  begin  with  the  specification  of  data  types,  not 
because  this  is  the  best  place  to  start,  but  rather  because  we  can  refer  to  the  large 
body  of  research  on  the  subject. 

A  typical  example  of  an  abstract  data  type  is  boolean  (Figure  2.1).  The 
reader  should  have  no  trouble  seeing  that  all  the  classical  rules  of  inference,  as 
well  as  the  operators  or  and  implies  may  be  derived  from  the  given  axioms  and 
primitive  operators.  In  this  sense,  Figure  2.1  represents  a  minimal  specification 
of  the  type  boolean.  A  student  of  logic  will  note  here  that  several  other 
combinations  of  primitive  operators  will  permit  all  the  others  to  be  derived.  We 
further  note  that  the  names  chosen  for  the  operators,  indeed  for  the  type  itself, 
are  purely  arbitrary. 
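
The claim that or and implies follow from the primitives can be checked mechanically for one model. The Python sketch below is a minimal illustration, assuming Python's True and False as the carrier for bool; the function names are hypothetical. It defines the two derived operators purely by composition of not and and (using De Morgan's law, one of several possible derivations) and checks each axiom of Figure 2.1 over every assignment of its free variables.

```python
from itertools import product

BOOL = (True, False)          # assumed carrier for the sort bool


# Primitive operators.
def not_(b):
    return not b


def and_(a, b):
    return a and b


# Derived operators: compositions of the primitives only.
def or_(a, b):
    return not_(and_(not_(a), not_(b)))


def implies(a, b):
    return not_(and_(a, not_(b)))


def axioms_hold():
    """Check the six axioms of Figure 2.1 for all values of the free variables."""
    checks = [False == not_(True)]
    for b in BOOL:
        checks.append(not_(not_(b)) == b)
        checks.append(and_(True, b) == b)
        checks.append(and_(False, b) == False)
    for b1, b2 in product(BOOL, repeat=2):
        checks.append(and_(b1, b2) == and_(b2, b1))
    for b1, b2, b3 in product(BOOL, repeat=3):
        checks.append(and_(and_(b1, b2), b3) == and_(b1, and_(b2, b3)))
    return all(checks)
```

An exhaustive check of this kind confirms only that one two-element model satisfies the axioms; it is not a proof about the specification itself.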
II. THEORY


There  are  many  ways  to  write  a  specification.  A  high  level  language  is  an 
example  of  a  specification  with  more  or  less  ambiguous  semantics.  To  achieve 
true  portability,  we  must  be  able  to  demonstrate  the  following  properties  in  our 
implementation: 

-  The  specified  semantics  actually  implemented  on  the  source  machine  are 
completely  unambiguous. 

-  The  implementation  on  the  source  machine  is  "correct". 

Thus, our method of specification must be formal enough to permit proofs of
correctness. Although the knowledgeable reader will know that the provability of
usefully complex specifications has so far been unrealized, research conducted in
parallel with this study (Griffin 1984) has given us reason to be optimistic.

The  requirement  for  unambiguous  precision  and  provability  leads  us 
naturally  to  a  mathematical  basis  for  our  specification.  Here  we  find  a  significant 
body  of  research  already  in  place  in  the  area  of  abstract  data  type  specification. 
Goguen  (1978)  and  Guttag  (1978)  treat  this  topic  in  great  detail.  We  will  not  do 
so  here.  Instead  we  give  an  overview  of  the  important  concepts  of  abstract  data 
types  and  the  underlying  theory  of  specifications  as  a  preface  to  a  treatment  of 
the  issues.  The  following  discussion  is  a  paraphrase  of  Davis  (1984).  The  reader 
is  directed  to  that  paper  for  more  details. 

A.  ABSTRACT  DATA  TYPES 

A  data  type  is  a  class  of  objects  together  with  a  set  of  operations  which  may 
be  performed  on  those  objects.  An  abstract  data  type  is  a  precise  description  of  a 
class  of  objects  in  terms  of  the  semantics  of  the  operations  which  may  be 
performed  on  the  class.  Our  reasons  for  considering  abstract  data  types  are 
twofold.  First,  in  any  specification  of  a  hardware  resource,  some  mention  of  the 
data  types  it  manipulates  must  be  made.  Second,  although  an  abstract  data  type 
is  primarily  a  problem  solving  resource,  it  may  imply  a  physical  resource  as  well. 
A  stack  is  a  good  example  of  this.  Our  goal  is  to  arrive  at  a  technique  which 


B.  HIGH  LEVEL  LANGUAGES 

The  vast  majority  of  large  software  systems  are  written  in  a  high  level 
language.  Dialectal  variations  between  languages  notwithstanding,  the  problem 
of  porting  software  from  one  hardware  environment  to  another  does  not  involve 
mapping  one  high  level  abstraction  to  another.  It  is  much  more  complex.  It 
involves  the  translation  of  a  relation  between  the  semantics  of  a  language  and  its 
implementation  on  one  machine  into  a  similar  but  not  identical  relation  on 
another  machine.  The  properties  of  this  relation  form  part  of  the  semantic  gap. 

1.  Implementation  and  the  Semantic  Gap 

Implementing a problem solving abstraction on conventional machines is
somewhat haphazard, due in part to difficulties in mapping the semantics of the
language onto the semantics of the hardware. It is unfortunately true that there
is  often  no  way  at  all  to  describe  the  meaning  of  a  problem  solving  abstraction  in 
terms  of  the  physical  resource.  This  occurs  because  language  designers  do  not 
want  to  acknowledge  the  existence  of  the  resource  provided  by  the  engineers  and 
because  engineers  do  not  see  the  physical  resource  as  part  of  a  higher  level 
abstraction. 

In general, when we push a physical resource abstraction up to meet a
problem solving abstraction, the class of languages which may be efficiently
implemented on that hardware is narrowed. Likewise, when a problem solving
abstraction is pulled down toward the physical resource, we lose the ability to
map our problems elegantly into a program.

The  task  of  implementing  a  language  should  acknowledge  the  semantic 
gap,  not  contribute  to  it.  We  must  therefore  find  ways  to  describe  the  precise 
relationship  between  the  language  semantics  and  the  resource.  Our  methodology 
should  make  this  easier. 

2.  The  Chicken  and  the  Egg 

One  problem  we  will  always  have  to  deal  with  is  where  to  start.  Do  we 
begin  by  defining  a  general  purpose  problem  solving  abstraction,  or  by  defining 
the  resource?  We  have  chosen  the  latter  for  several  reasons.  First,  we  have  come 
to  realize  that  it  is  easy  to  dream  up  problem  solving  abstractions  which  are 
simply unimplementable. Ada may be a prime example of this. Second, an


understanding  of  the  fundamental  characteristics  of  the  resource  reveals  much 
more  about  the  relationship  between  a  language  and  its  implementation  than  an 
understanding of the semantics of the language. Third, we have gotten pretty
good at describing language abstractions, but have only scratched the surface in
trying to formally describe a physical resource. Thus, we devote our efforts to
describing  an  abstract  machine. 

C.  THE  PHYSICAL  RESOURCE 

The  ideas  behind  the  concept  of  a  memory  or  a  display  are  not  really  well 
understood.  We  know  they  are  complex  physical  structures,  but  we  have  great 
difficulty  formalizing  what  it  means  to  fetch  or  store  values.  The  primary  cause 
of  this  difficulty  is  the  design  process  itself. 

The  hardware  design  process  is  a  battle  against  the  clock,  against  rising 
complexity,  against  shrinking  space,  where  opportunity  is  expediency,  and  unused 
space  is  a  crime.  That  must  change  if  the  semantic  gap  is  to  be  narrowed 
significantly. 

Complex  components  imply  complex  semantics.  Complex  semantics  imply 
even  more  complex  specifications.  The  conventional  design  goals  of  minimizing 
circuit  complexity,  and  of  maximizing  the  regularity  and  orthogonality  of  the 
instruction  set  architecture  do  not  really  address  the  issue  of  semantic 
complexity.  If  our  goal  is  to  increase  software  portability,  a  way  must  be  found 
to  coalesce  the  many  conflicting  considerations  affecting  the  design  process. 
Admittedly,  the  hardware-software  relationship  is  only  one  of  these 
considerations. 

1.  The  Instruction  Set 

An  important  question  one  might  ask  is,  what  effect  does  the 
specification  methodology  have  upon  the  design  of  the  instruction  set 
architecture?  If  we  are  prevented  from  describing  the  instructions  we  need  for  an 
application, the whole method loses much of its appeal. Interestingly, we have
found that the content of the instruction set depends much more on the types of
data we are able to describe than on the method of algebraic specification.
Representation  independence  renders  meaningless  instructions  like  shift  and 
rotate  because  the  level  of  abstraction  is  necessarily  higher.  In  essence,  the 


physical  resource  is  made  to  look  more  like  a  primitive  problem  solving 
abstraction. 

We found also that the typical issues facing the designer of an instruction
set, such as timing, opcode size and addressing modes, tend to become
moot. However, considerations of regularity and orthogonality -- programming
language issues -- remain as important as ever. This reinforces the observation
that  emphasis  upon  the  resource  as  a  resource  is  considerably  diminished  when 
the  machine  is  modeled  as  an  abstraction. 

2.  Overlap 

The  term  overlap  is  used  here  in  reference  to  the  manner  in  which 
machine  data  types  are  overlaid  upon  a  common  resource,  the  memory.  Overlap 
occurs  in  conventional  machines  because  data  structures  are  overloaded  to 
prevent the waste of valuable computer resources. For a typical word size of 32
bits,  the  practice  is  to  assign  the  character  type  to  a  byte  (8  bits),  short  integers 
to  a  half  word,  long  integers  to  a  word,  and  standard  and  extended  precision 
floating  point  numbers  to  a  word  and  double  word  respectively.  Without  even 
mentioning  the  problem  of  alignment1  it  should  be  clear  to  even  the  casual  reader 
that  describing  the  semantics  of  such  a  memory  system  would  be  a  mess.  We 
borrow  from  Giegerich  (1983)  to  illustrate  this  point.  We  define  a  function 
overlap  which  accepts  two  cell  identifiers  and  returns  true  when  overlap  exists 
between the cells, false if not. If we assume even alignment for 16-bit words and
32-bit longwords, then the overlap axiom relating 16 and 32-bit words would be

$\mathit{overlap}(M_{16}\,a,\ M_{32}\,b) = (b \le a \le b + 2)$

where a and b are addresses, and, of course, overlap is commutative. Now,
imagine  having  to  specify  a  set  of  overlap  axioms  relating  each  data  type  to  every 
other  data  type,  and  then  having  to  specify  them  everywhere  they  applied  to 
axioms throughout the specification! What makes this even worse is that it can be
shown that, for certain configurations of memory, there may be an infinity of
such axioms. Therefore, we avoid overlap.
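
A general overlap test is easy to state once cells are viewed as address intervals. The Python sketch below assumes byte addressing and sizes given in bytes (the text does not fix the unit), and the names overlap and pair_axioms are hypothetical. Under these assumptions, and with both addresses even, a 16-bit cell at a and a 32-bit cell at b overlap exactly when b <= a <= b + 2.

```python
from itertools import combinations


def overlap(a, a_size, b, b_size):
    """True when the cell [a, a + a_size) shares any byte with [b, b + b_size)."""
    return a < b + b_size and b < a + a_size


def pair_axioms(type_names):
    """Unordered pairs of distinct types; each pair needs its own overlap axiom."""
    return list(combinations(type_names, 2))
```

For the five machine types named in the text (character, short integer, long integer, and the two floating point formats), pair_axioms returns ten pairs. In an algebraic specification each pair would need its own closed-form axiom, which is the burden the text describes.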


1  Many  machines  require  types  larger  than  one  byte  to  be  aligned  on  an  even  address. 


D.  SPECIFICATIONS 

Algebraic  specifications  impose  restrictions  upon  the  class  of  objects  we  can 
describe.  Although  a  benefit  from  this  is  that  it  forces  us  to  think  very  carefully 
about  the  objects  we  are  attempting  to  specify,  it  is  important  not  to  allow  the 
methodology  to  restrict  our  thinking.  That  this  can  easily  happen  has  been 
demonstrated  over  and  over  again  with  programming  languages.  Experienced 
programmers are masters of idiom. But mastering the "tricks" of a particular
specification language should not be considered a goal.

1. Notation, Syntax and Semantics

Although  the  notation  is  theoretically  arbitrary,  the  design  of  a 
specification  language  is  at  least  as  difficult  as  designing  a  programming 
language, perhaps more so. Abstract algebra already has a body of accepted
notation, and familiarity with it tends to bias one's ideas about how to go about
designing a language. Some of the key points to remember are:

-  The  grammar/syntax  should  support  automated  parsing. 

-  The  language  should  not  make  it  easy  for  the  writer  to  specify  things  which 
cannot  possibly  describe  physical  objects  (such  as  an  object  with  an  infinite 
number  of  terms). 

-  The  language  should  be  human  readable  since  anything  usefully  complex  will 
be  difficult  enough  to  understand  without  requiring  the  reader  to  wade 
through  syntax  to  determine  the  meaning  of  a  specification. 

The  relationship  between  a  language  and  the  semantics  it  is  intended  to 
express  is  often  difficult  to  understand.  Indeed,  this  fact  is  one  of  the  reasons  for 
this  study.  That  the  meaning  of  a  block  of  statements  in  a  specification  depends 
upon  a  complex  mathematical  theory  does  not  make  this  relationship  any  easier 
to  discern.  Notation  and  syntax  should,  in  the  worst  case,  have  no  effect 
whatsoever  upon  the  expressibility  of  the  abstraction. 

In  a  programming  language,  the  symbols  which  make  up  a  program 
represent  abstract  objects  with  which  most  of  us  are  familiar.  The  fact  that  a 
specification  language  "looks"  and  "feels"  like  a  programming  language  is  not 
necessarily  a  good  thing.  On  the  pro  side,  similarities  between  an  algebraic 
specification  language  and  procedural  programming  languages  help  those 
unfamiliar  with  the  methodology  to  understand  how  to  describe  abstractions. 


Unfortunately,  this  "understanding"  is  tainted  by  the  knowledge  most  of  us  have 
gained  through  years  of  experience.  It  will  not  do  to  explain  to  a  budding 
specification  writer,  "You  can't  use  the  name  of  a  sort  as  an  argument  to  an 
operator  because  a  sort  is  just  an  index  into  a  set  of  carriers." 

There  is  one  very  important  difference  between  a  programming  language 
and  a  specification  language.  The  semantics  of  a  programming  language 
construct  or  of  a  particular  statement  in  a  program  may  be  ambiguous  for  any 
number  of  reasons.  The  language  may  be  poorly  defined,  there  may  be  several 
"dialects"  in  use,  and  of  course,  the  compiler  writer  may  have  erred  during  the 
implementation.  Although  the  latter  case  is  still  possible  in  the  implementation 
of  a  specification,  one  thing  is  certain  --  the  meaning  of  a  particular  axiom  is 
completely  defined.  We  may  not  know  what  we  have  written,  we  may  think  it 
means  something  it  does  not,  we  may  even  have  expressed  a  built-in  ambiguity2, 
but  the  true  meaning  of  an  axiom  is  completely  determined  by  the  underlying 
theory  we  discussed  in  Chapter  2.  The  problem  is  figuring  out  what  that 
meaning is. Unfortunately, one of the most important results of actually
designing and implementing a specification is that we discover there is just no
easy way to find this out. We cannot even be certain that an incorrectly
specified  abstraction  will  be  guaranteed  to  fail  when  it  is  implemented,  because 
any  implementation  is  at  best  a  finite  instantiation  of  a  subclass  of  objects 
described  in  the  specification.  One  implementation  may  work  fine  because  the 
values  which  uncover  the  ambiguity  are  simply  not  defined,  while  another,  less 
restrictive  implementation  may  not  work  at  all.  We  will  return  to  this  issue 
again  in  our  discussion  of  the  implementation. 

We  have  already  noted  that  errors  are  difficult  to  handle  in  algebraic 
specifications.  It  is  not  that  they  are  difficult  to  express,  nor  is  it  that  it  is 
difficult  to  determine  where  errors  might  occur.  Rather,  it  is  that  a  formal 
treatment  of  errors  usually  results  in  an  explosion  of  extra  terms  due  to  a 
tremendous  increase  in  the  number  and  complexity  of  axioms,  which  must  be 
modified  to  account  for  these  "boundary  conditions".  All  we  have  to  say  about 

2 An axiom which evaluates to two different terms, depending upon the order of evaluation,
is explicitly ambiguous.
this is, we realize it is a difficult problem which must be solved, and that we have
no  good  solution  for  it. 

2.  Parameterized  Specifications 

The  more  complex  the  object  we  are  attempting  to  describe,  and 
particularly,  the  more  general  a  class  of  operations,  the  more  likely  it  is  that  a 
parameterized specification will be required. Since the meaning and method of
expressing parameterized specifications are highly disputed, we have used one only
once in our specification -- to describe a data type for character strings.

Parameterized  specifications  provide  an  additional  level  of  abstraction  to 
those  we  described  in  Chapter  2.  They  specify  a  template  onto  which  the  sorts 
and operators of another specification must be mapped. This mapping is
one-to-one. The axioms and operations expressed in the body of the parameterized
specification  become  available  to  the  parameterized  type  when  it  is  instantiated. 
Parameterized  specifications  make  the  already  difficult  task  of  determining  the 
properties  of  the  carrier  sets  even  more  difficult. 
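
Instantiating a parameterized specification can be pictured as a renaming. The Python sketch below is illustrative only and is not the string spec of AM: the template SEQ, its operators, and the function instantiate are assumptions. Operators are stored as name -> (argument sorts, result sort), and only the sorts are renamed here, through a one-to-one mapping; operators could be mapped in the same manner.

```python
# Hypothetical parameterized template over the formal sorts "elem" and "seq".
SEQ = {
    "empty":  ((), "seq"),
    "append": (("seq", "elem"), "seq"),
    "head":   (("seq",), "elem"),
}


def instantiate(template, mapping):
    """Return a copy of template with formal sorts renamed through mapping.

    The mapping must be one-to-one: two formal sorts may not share an actual sort.
    """
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("sort mapping must be one-to-one")

    def rename(sort):
        return mapping.get(sort, sort)

    return {
        name: (tuple(rename(s) for s in args), rename(result))
        for name, (args, result) in template.items()
    }
```

For example, instantiate(SEQ, {"elem": "char", "seq": "string"}) produces operators over char and string, while a mapping that sends both formal sorts to the same actual sort is rejected.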

spec A
end A

spec B
   extend A
end B

spec C
   extend A
end C

spec D
   extend B,C
end D

Figure  3.2:  The  Problem  with  Extension 
3. Extension

The  concept  of  extension  is  somewhat  analogous  to  the  way  attributes 
may  be  added  to  an  object  in  an  object  oriented  language,  such  as  Smalltalk.  An 
existing  specification  is  "extended"  with  the  addition  of  more  sorts,  operators 
and/or  axioms.  Extension  is  required  because  the  process  of  building  the  abstract 
specification  involves  continuously  adding  to  existing  specifications,  moving  from 
low  level  primitives,  through  higher  and  higher  levels  of  abstraction.  The  reader 
will  note  that  this  is  a  classic  example  of  bottom  up  design.  The  algebraic 
specification methodology we use here requires it.

A  serious  problem  with  extension  involves  the  proliferation  and 
duplication  of  specifications  through  the  abstraction  hierarchy.  Figure  3.2 
illustrates  this.  Notice  that  specs  B  and  C  are  extensions  of  A.  But  D  extends  B 
and C, so there are now two "copies" of A in D. The analogy to scoping in a
programming language looks attractive, but is very weak, if not incorrect. It is
closer to the concept of multiple inheritance in an object-oriented language.
When we say extension adds new operators, axioms and sorts to an existing
specification, we really mean "adds new objects and rules to an existing collection
of objects and rules". Illustrated in Figure 3.2 is the addition of a specification to
itself  (A  on  A).  What  effect  does  this  have  upon  the  semantics  we  are 
describing?  Most  references  do  not  treat  this  problem. 
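
The duplication in Figure 3.2 can be reproduced with a few lines of Python. This is a sketch under assumptions: the table EXTENDS encodes the figure, and the two functions (both hypothetical) contrast naive textual inclusion with one possible remedy, in which each spec is included at most once. Whether the second reading is the right semantics is exactly the question the text leaves open.

```python
# The extension hierarchy of Figure 3.2: spec name -> specs it extends.
EXTENDS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


def flatten(spec, extends=EXTENDS):
    """List every spec reached from spec by naive inclusion, repeats included."""
    out = [spec]
    for parent in extends[spec]:
        out += flatten(parent, extends)
    return out


def flatten_once(spec, extends=EXTENDS):
    """Same traversal, but each spec is included at most once (first visit wins)."""
    seen = []

    def visit(s):
        if s not in seen:
            seen.append(s)
            for parent in extends[s]:
                visit(parent)

    visit(spec)
    return seen
```

With the naive reading, flatten("D") contains A twice; with sharing, flatten_once("D") contains it once.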
IV. DESIGN

Although the literature is filled with examples used to illustrate how one
might  specify  abstractions,  few  provide  a  practical  treatment  of  the  problem  of 
designing  a  working  system.  This  study  has  been  an  attempt  to  bring  the  theory 
down  to  earth  —  to  show  that  it  really  is  possible  to  use  algebraic  specifications 
to  design  something  which  we  not  only  can  talk  about,  but  which  we  can  actually 
use. 

A.  THE  SPECIFICATION  LANGUAGE 

Appendix  A  contains  a  high  level  grammar  for  our  specification  language, 
which  is  similar  to  examples  found  in  the  literature,  with  changes  to  give  it  the 
feel  of  a  programming  language.  A  "module"  in  the  specification  is  called  a  spec. 
The  entire  specification  forms  a  hierarchy  of  specs  which  are  related  to  one 
another  through  the  operation  of  extension  which  we  described  in  the  previous 
chapter.  Each  spec  may  introduce  zero  or  more  new  sorts,  operators  and/or 
axioms,  which  may  be  added  to  an  existing  spec  through  extension,  or  which  may 
form  the  primitives  of  a  new  "branch"  of  the  hierarchy.  Although  it  is 
conceivable  that  one  might  specify  an  object  composed  of  disjoint  specs,  this  is 
not  the  usual  case.  Extension  provides  the  only  means  of  relating  the  carriers 
and  operators  described  in  two  different  specs1. 

Our  language  also  permits  the  use  of  parameterized  specifications,  although 
we  minimize  their  use  because  their  properties  are  not  well  understood. 

We  avoid  a  detailed  description  of  the  syntactic  sugar,  since  this  is  essentially 
arbitrary. The semantics and overall structure, however, are not. For example, all
symbols  must  be  unique.  No  symbol  may  be  used  unless  it  has  first  appeared  as 
the  name  of  a  spec,  in  a  sort  definition,  or  to  the  left  of  a  colon  in  an  operator 
definition.  This  rule  guarantees  that  at  no  time  are  the  properties  of  the  object 
inferred  from  the  name  ambiguous.  Thus,  the  structure  of  a  specification  is  much 

1 There are several other operations by which two specifications may be related. They are
discussed in Fasel (1983). We do not use them here.
like a Pascal program, but more restrictive. There are no self-referential specs,
and  no  use  of  a  spec  before  it  has  been  defined. 
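
The uniqueness and define-before-use rules lend themselves to a simple mechanical check. The sketch below is a hypothetical Python illustration: the declaration format (a list of tuples) is an assumption and is unrelated to the grammar in Appendix A, and free variables in axioms are not modelled.

```python
def check(declarations):
    """Return a list of rule violations found in a sequence of declarations.

    Each declaration is one of:
        ("spec", name)                     introduces a spec name
        ("sort", name)                     introduces a sort name
        ("op", name, arg_sorts, result)    introduces an operator name
        ("extend", spec_name)              uses a previously defined spec
    """
    defined, errors = set(), []
    current = None                      # name of the spec being read

    def define(symbol):
        if symbol in defined:
            errors.append(f"duplicate symbol: {symbol}")
        defined.add(symbol)

    def use(symbol):
        if symbol not in defined:
            errors.append(f"used before definition: {symbol}")

    for kind, name, *rest in declarations:
        if kind == "spec":
            define(name)
            current = name
        elif kind == "sort":
            define(name)
        elif kind == "op":
            arg_sorts, result = rest
            for sort in (*arg_sorts, result):
                use(sort)               # sorts must already be declared
            define(name)
        elif kind == "extend":
            if name == current:
                errors.append(f"self-referential spec: {name}")
            else:
                use(name)
        else:
            errors.append(f"unknown declaration: {kind}")
    return errors
```

A well-formed sequence yields an empty list; a spec that extends itself, reuses a symbol, or mentions an undeclared sort is reported.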

The  language  also  introduces  the  idea  of  primitive,  derived,  hidden  and  error 
operators.  Primitive  operators  are  those  which  must  be  implemented  to  provide  a 
full  instantiation  of  the  specification.  Derived  operators  are  simply  that  — 
derived  from  the  primitives.  The  implementor  may  elect  to  ignore  these,  secure 
in  the  knowledge  that  their  functions  may  be  performed  by  composition  of 
primitives.  In  our  specification  for  boolean,  or  and  implies  are  derived 
operators.  Error  operators  accept  no  arguments.  They  are  guaranteed  to  return 
a  value  of  the  result  sort  which  must  be  an  error.  The  need  for  such  operators 
and  their  limitations  are  described  in  detail  in  Goguen  (1978).  We  found  them, 
in  practice,  to  be  a  nuisance.  We  will  return  to  the  issue  of  errors  in  our 
discussion  of  the  implementation.  Hidden  operators  are  those  to  which  the 
programmer  has  no  access.  They  represent  abstractions  of  the  machine  required 
to  express  a  certain  semantics  but  nothing  more. 

1.  The  Macro  Preprocessor 

One  of  the  things  we  quickly  realized  as  the  specification  became  more 
complex  was  that  the  writer  of  a  specification  spends  a  lot  of  time  writing  the 
same  thing,  over  and  over  again.  This  occurs  whenever  the  specification  calls  for 
the  description  of  a  number  of  general  purpose  operators  which  operate  on 
elements of a number of different carriers through the use of a mapping function.
Our fetch and store operators are an example of this. They are capable of
storing and retrieving values of any type to and from primary storage. All the
AM  data  types  map  into  a  common  type,  value,  which  is  passed  to  or  returned 
from  fetch  and  store.  The  spec  which  describes  the  mapping  function  for  each 
type  is  virtually  identical  except  for  the  names  of  operators  and  sorts.  Thus,  we 
introduced  a  partially  defined,  imaginary  macro  preprocessor  which  provides  for 
macros  with  parameters.  The  reader  will  see  examples  of  its  use  throughout  the 
specification. 

The  basic  form  of  a  macro  definition  is 

replace  "text..."  with  "other  text..." 

Since  the  lexics  of  our  specification  language  does  not  require  quotes,  they  are 
used as delimiters for definition and equivalence strings. A macro with
arguments looks like

replace(A,B,...,Z) "text..." with "other text..."

where  the  formal  parameters  must  be  capital  letters.  Since  we  do  not  allow 
uppercase  letters  within  the  spec  itself,  an  uppercase  letter  denotes  a  formal 
parameter  to  a  macro.  Thus  for  the  definition 

replace(A)
   "convert(A)"
with
   "atomofA: val -> A"

then the string

convert(bool)

would be replaced by

atomofbool: val -> bool

wherever it appeared.
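
The expansion of a macro with parameters can be sketched in Python. Since the text describes the preprocessor as partially defined and imaginary, what follows is only one possible reading: the function expand and its argument names are assumptions, actual parameters are taken to be runs of lowercase letters and digits, and the arrow is written -> as in Figure 2.1.

```python
import re


def expand(text, params, pattern, replacement):
    """Expand every occurrence of one parameterized macro in text.

    params      -- formal parameters, each a single capital letter
    pattern     -- the first quoted string of the definition
    replacement -- the second quoted string of the definition
    """
    formals = set(params)
    parts, seen = [], set()
    for ch in pattern:
        if ch in formals and ch not in seen:
            parts.append(f"(?P<{ch}>[a-z0-9]+)")   # first use: capture the actual
            seen.add(ch)
        elif ch in formals:
            parts.append(f"(?P={ch})")             # later use: must match again
        else:
            parts.append(re.escape(ch))            # literal text
    regex = re.compile("".join(parts))

    def substitute(match):
        # Copy the replacement, swapping each formal for its captured actual.
        return "".join(match.group(ch) if ch in formals else ch
                       for ch in replacement)

    return regex.sub(substitute, text)
```

Because uppercase letters never occur in the spec itself, any capital letter in the two strings can safely be treated as a formal parameter, which is what keeps this substitution unambiguous.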

B.  THE  MACHINE 

AM is an abstract machine whose overall concept is based upon a simple
design put forward by Fasel (1980). Appendix B contains the specification which
describes it, and Appendix C contains the programmer's manual for a simple
assembler which produces native, relocatable AM object code.

Now that we have the theory upon which to base a specification, the next
important question to answer is, what do we design? Our stated goal has been to
contribute to solving the portability problem by attacking the semantic gap.
But not only must we design a machine, we must also remember and analyze the
process of designing it. Therefore, we now treat this process, discussing our
fundamental design decisions and the reasons behind them.

At  the  time  of  this  writing  there  are  many  examples  of  advanced  special  and 
general  purpose  architectures.  Some  of  the  big  names  are  RISC  (Patterson  1982) 
and various language directed architectures (Waite 1975, Hoffman 1982 and
Myers 1982). After a survey of these and other references, we decided to put off
talking about the architecture until we could see what sorts of things we might
describe and understand with our specification language.

In  his  PhD  thesis,  Fasel  (1980)  describes  a  simple  abstract  machine  called 
SAM (Single Accumulator Machine). After wading through his spec and some
preliminary  attempts  to  specify  a  few  objects  of  greater  complexity,  we  decided  to 
model  a  conventional  architecture.  We  have  two  very  good  reasons  for  this. 
First,  since  we  are  all  familiar  with  the  typical  Von  Neumann  processor,  we 
should  be  more  likely  to  find  good  ways  to  formally  describe  it.  Second,  this 
same  familiarity  should  make  it  more  likely  for  others  to  understand  our 
specification. 

The  next  step  is  determining  where  to  start.  This  is  not  too  difficult.  The 
operation  of  every  machine  can  be  reduced  to  a  complex  sequence  of  simple 
operations. At a level of abstraction below the basic data element and its
primitives we would be required to specify the semantics of processing elements
and control stores. At a level above the basic data element we would merely be
adding  another  to  the  long  list  of  Von  Neumann  programming  languages. 
Therefore,  we  use  as  a  basis  for  the  specification,  the  primitive  data  types.  In  the 
interests  of  simplicity,  we  chose  five:  boolean,  natural  (unsigned),  integer, 
character  and  string.  These  form  the  atomic  data  types,  referred  to  hereafter  as 
atoms. 

Data  types  implemented  on  conventional  architectures  exhibit  a  built-in 
dependence  upon  the  way  in  which  values  are  represented  in  hardware.  This 
arises  naturally  from  design  goals  which  stress  storage  efficiency,  and  leads  to 
several  undesirable  properties.  First,  machine  data  structures  are  overloaded. 
Given an arbitrary address, not only can we not tell what type of data we have
accessed, we cannot even determine with certainty if we have accessed all of it.
We  might  be  in  the  middle  of  a  floating  point  number  or  on  the  end  of  a 
character  string.  Second,  nothing  prevents  a  programmer  from  treating  one  type 
of  data  as  another.  Third,  the  "state"  of  a  machine  is  impossible  to  analyze. 
The  endless  string  of  bits  characterizing  the  "meaning"  of  a  program  at  a 
particular  instant  provides  no  hope  for  proving  something  about  the  program's 
correctness.  We  therefore  offer  an  architecture  which  will  rationalize  the 
relationship between data and the machine, but which can be implemented easily
either  by  emulation  or  through  direct  hardware  means. 

An  abstract  machine  which  solves  these  problems  must  have  the  following 
properties: 

-  In  the  organization  of  primary  storage,  the  next  logical  data  item  is  in  the 
next  logical  address. 

-  Except  as  formally  specified,  no  data  type  may  be  accessed  in  any  way  as 
another. 

-  Given  any  arbitrary  logical  address,  the  value  stored  there  and  its  type  can 
always  be  determined. 

Hence, we use a tagged architecture with some very special characteristics,
which  takes  away  some  of  the  programmer's  freedom  to  "twiddle"  bits.  The 
resource  provided  by  this  architecture  will  now  be  partitioned  into  functional 
areas  along  the  lines  of  a  conventional  machine. 

Typical  resources  available  at  the  instruction  set  level  include  the  primary 
storage,  high  speed  registers,  stacks,  I/O  ports  and  perhaps  a  heap.  We  will 
define  abstractions  for  each  of  these.  Here,  again,  we  see  a  marked  difference 
between  the  conventional  view  of  the  physical  resource  and  that  imposed  by  our 
specification  method.  Ports,  stacks  and  the  heap  are  usually  thrown  right  in  with 
the  rest  of  the  program  and  data.  In  fact,  as  we  have  said  in  Chapter  1,  stacks 
are  often  accessed  as  arrays.  AM  treats  each  of  these  resources  as  a  black  box. 
One  may  push,  pop  and  read  the  top  of  a  stack,  but  the  stack  pointer  is 
inaccessible,  as  are  any  values  below  the  top  element  (unless  one  pops  the  stack 
to  reach  them).  We  thus  remove  another  freedom  once  enjoyed  by  the 
programmer  —  that  of  treating  one  type  of  data  structure  as  another. 
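
The black-box view of the stack described above can be sketched in C as follows. This is an illustration only and is not taken from the AM interpreter: the names am_stack, am_init, am_push, am_pop and am_top, the fixed capacity, and the use of plain int cells are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity; AM itself lets stack sizes be configured. */
#define AM_STACK_CAP 8

/* In a real module this struct would be private to one source file so
 * that clients never see sp. It is shown here to keep the sketch whole. */
typedef struct {
    int sp;                     /* number of live elements */
    int cell[AM_STACK_CAP];
} am_stack;

void am_init(am_stack *s)
{
    assert(s != NULL);
    s->sp = 0;
}

/* Each operation returns 0 on success and -1 on error. */
int am_push(am_stack *s, int v)
{
    assert(s != NULL);
    if (s->sp == AM_STACK_CAP)
        return -1;              /* overflow */
    s->cell[s->sp++] = v;
    return 0;
}

int am_pop(am_stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->sp == 0)
        return -1;              /* popping an empty stack is an error */
    *out = s->cell[--s->sp];
    return 0;
}

int am_top(const am_stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->sp == 0)
        return -1;              /* an empty stack has no top */
    *out = s->cell[s->sp - 1];
    return 0;
}
```

The only way to reach a value below the top is to pop what lies above it, which is the restriction the text describes.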

A  conventional  instruction  set  forms  an  abstraction  closely  tied  to  the 
representation  of  data  in  the  hardware.  Our  architecture  makes  this  impossible. 
Instead,  whatever  instruction  set  we  design  will  become  much  closer  to  a 
primitive  problem  solving  abstraction.  Again  in  the  interest  of  simplicity  and 
understanding,  we  define  an  instruction  set  which  should  be  thoroughly  familiar 
to  most  readers  who  have  programmed  in  assembly  language. 

A proposal for a hardware implementation with these properties is given in Yurchak (1984).

The restrictions upon the programmer's freedom which we have discussed are
justified  because  by  giving  up  the  ability  to  do  almost  anything  we  can  imagine, 
we  gain  the  ability  to  explicitly  specify  our  intent  during  the  course  of  a  program. 
The  specification  does  not  specify  what  a  resource  is  or  how  it  is  implemented.  It 
does  specify  exactly  what  the  resource  means  and  how  to  use  it. 

AM is an abstraction of a conventional Von Neumann resource with some
unconventional  properties.  The  primary  (only)  machine  element  is  called  a  value. 
All  data  primitives  (atoms)  map  into  values.  Primary  storage  is  an  array  of  one 
or  more  memory  segments,  each  of  which  may  contain  an  arbitrary  number  of 
cells.  Each  cell  is  capable  of  "containing"  any  legal  data  value.  Both  programs 
and  data  may  reside  together  in  a  single  segment.  For  high  speed  storage,  there 
are one or more register segments, each of which contains an arbitrary number of
registers.  Again,  every  register  is  capable  of  containing  any  type  of  data.  AM 
also  has  one  or  more  stacks,  a  heap,  and  a  crude  file  system.  We  will  discuss  the 
details  in  the  next  section. 

The  basic  atomic  data  types  are  augmented  by  several  others  needed  for  the 
execution  of  programs.  These  are  memory  addresses,  register  addresses,  stack 
addresses,  file  addresses  and  instructions. 

C.  THE  SPEC 

The  specification  for  AM  is  contained  in  Appendix  B.  The  language  used  to 
describe  it  obeys  the  grammar  found  in  Appendix  A.  We  will  discuss  the 
specification  in  some  detail  since  portions  are  nonintuitive. 

1.  Macro  Definitions 

At  the  top  of  the  specification  are  listed  a  number  of  macro  definitions. 
We  concern  ourselves  for  the  time  being  with  just  those  definitions  pertaining  to 
the  properties  of  relations.  The  intended  properties  of  certain  operators  will 
require  that  we  express  axioms  for  commutativity,  transitivity,  etc.,  throughout 
the  specification.  Rather  than  write  this  out  repeatedly,  we  define  macros  with 
appropriate  parameters  which  permit  a  more  readable  and  explicit  expression  of 
these  properties.  Take,  for  example,  the  equality  operator  for  integers. 

eqint: int,int -> bool;

which returns true if the arguments are equivalent, false if not. We should like
to  express  that  eqint  is  an  equivalence  relation  on  objects  of  type  int.  Thus,  we 
need  the  following  axioms: 

eqint(i,i) = true;

eqint(i,j) = eqint(j,i);

implies(and(eqint(i,j),eqint(j,k)),eqint(i,k)) = true;

But  there  are  relations  like  this  one  throughout  our  specification.  Thus  we  define 
macros  like 

replace(X,S)

"equivrel(X,S)"

with

"for i in S
    X(i,i) = true;
for i,j in S
    X(i,j) = X(j,i);
for i,j,k in S
    implies(and(X(i,j),X(j,k)),X(i,k)) = true"

which  permits  us  to  write,  in  the  case  of  eqint, 

equivrel  (eqint,  int); 

We then read this as "eqint is an equivalence relation on int". Note that we
are  not  required  to  explicitly  specify  the  type  of  free  variables,  since  this  can 
normally  be  determined  by  context.  We  do  so  in  the  interest  of  clarity,  since 
there  can  be  no  doubt  for  which  type  eqint  is  an  equivalence  relation. 
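
As a rough analogue in C, the preprocessor can play the role of the equivrel macro. The sketch below is our own illustration and not part of the specification language: EQUIVREL, the generated check_ functions, and the helper leint are assumed names. Where the specification asserts the axioms for all values, the C version can only test them over a finite sample.

```c
#include <assert.h>
#include <stddef.h>

/* Material implication on C truth values. */
int implies(int p, int q) { return !p || q; }

/* EQUIVREL(X, S) expands to a function check_X that tests the three
 * equivalence axioms for relation X on every triple drawn from a
 * finite sample of type S. It returns 1 if all hold, 0 otherwise. */
#define EQUIVREL(X, S)                                                  \
    int check_##X(const S *v, size_t n)                                 \
    {                                                                   \
        size_t i, j, k;                                                 \
        for (i = 0; i < n; i++) {                                       \
            if (!X(v[i], v[i])) return 0;              /* reflexive */  \
            for (j = 0; j < n; j++) {                                   \
                if (X(v[i], v[j]) != X(v[j], v[i]))                     \
                    return 0;                          /* symmetric */  \
                for (k = 0; k < n; k++)                                 \
                    if (!implies(X(v[i], v[j]) && X(v[j], v[k]),        \
                                 X(v[i], v[k])))                        \
                        return 0;                      /* transitive */ \
            }                                                           \
        }                                                               \
        return 1;                                                       \
    }

int eqint(int a, int b) { return a == b; }
int leint(int a, int b) { return a <= b; }   /* reflexive, not symmetric */

EQUIVREL(eqint, int)
EQUIVREL(leint, int)
```

On any sample containing two distinct integers, check_eqint succeeds and check_leint fails at the symmetry test, which mirrors the difference between a relation that satisfies equivrel and one that does not.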

For  the  reader  who  doubts  that  the  more  complex  macros  described  in 
this  specification  will  work,  a  modified  version  of  the  familiar  M-4  macro 
preprocessor (see footnote 5) will correctly deal with every macro found in our specification.

2.  The  Atomic  Types 

The  basic  data  types  form  the  primitive  objects  of  the  problem  solving 
abstraction. The programmer's algorithm must, in some way, be mapped into
these objects. Boolean is described first because, not only is it a data type
available  for  use  by  the  programmer,  it  is  also  part  of  the  specification  itself. 

5. See Kernighan and Ritchie, The M4 Macro Preprocessor, Bell Laboratories, Murray Hill,
New Jersey, July 1974.
modifying a few constants. Only one module of our interpreter need be
recompiled to make this alteration.

1.  MAPPING  OPERATORS  TO  FUNCTIONS 

It seems natural, although incorrect, to look at the operators in a spec as
functions. However, in the implementation, this makes perfect sense. Figure 5.4
lists the code for the AM module which implements the boolean type. The
header files which provide the constant definitions are omitted here. Notice that,
where possible, we rely upon the operations provided by the C language, rather


BOOL atomofbool(v)
VAL v;
{
    BOOL b;

    if (v.type != BOOL_VAL)
        error("value not of type BOOL - %x", v.type);
    b.type = BOOL_TYPE;
    b.val = v.boolval.val;
    return(b);
}

VAL valofbool(b)
BOOL b;
{
    VAL v;

    if (b.type != BOOL_TYPE)
        error("atom not of type BOOL - %x", b.type);
    v.boolval.type = BOOL_VAL;
    v.boolval.val = b.val;
    return(v);
}


Figure  5.5:  Error  Handling 


BOOL true = { BOOL_TYPE, 1 };
BOOL false = { BOOL_TYPE, 0 };

BOOL not(a)
BOOL a;
{
    a.val = !a.val;
    return(a);
}

BOOL and(a,b)
BOOL a, b;
{
    a.val = (a.val && b.val);
    return(a);
}

BOOL or(a,b)
BOOL a, b;
{
    a.val = (a.val || b.val);
    return(a);
}

BOOL eqbool(a,b)
BOOL a, b;
{
    a.val = (a.val == b.val);
    return(a);
}

BOOL nebool(a,b)
BOOL a, b;
{
    a.val = (a.val != b.val);
    return(a);
}


Figure  5.4:  Operator-Function  Mapping 
typedef struct {
    int size;
    VAL **val;
} memseg;
typedef struct {
    int num;
    VAL **val;
} regseg;
typedef struct {
    int size;
    int sp;
    VAL **val;
} stkseg;
typedef struct {
    int stat;
    int mode;
    int type;
    int val;
} fileseg;

#define _NUMMEMSEG 1024
#define _NUMUSRSEG 2
#define _NUMREGSEG 1
#define _NUMSTKSEG 1
#define _NUMFILES 16

memseg _mem[_NUMMEMSEG] = {
    1024, 0,
    1024, 0 };

regseg _reg[_NUMREGSEG] = {
    32, 0 };

stkseg _stk[_NUMSTKSEG] = {
    512, 512, 0 };

fileseg _file[_NUMFILES] = {
    1, RMODE, CHAR_VAL, 0,
    1, WMODE, CHAR_VAL, 1,
    1, WMODE, CHAR_VAL, 2 };


Figure  5.3:  The  Physical  Resource 




interpreter.  Each  atom  is  represented  as  a  structure  consisting  of  a  16-bit  tag 
field,  and  a  value  field.  The  size  of  the  value  field  varies  with  the  type.  Each 
sort  in  the  specification  is  assigned  a  sixteen  bit  code.  Whenever  an  atom  is 
created,  or  copied,  it  is  tagged  with  the  appropriate  code. 

By  using  a  fixed  size  tag  field  as  the  first  field  in  each  record,  we  build  in 
some  additional  robustness,  since  even  in  the  event  of  a  mistyped  structure  being 
copied  into  the  formal  parameter  of  a  function,  we  can  rely  upon  the  first  word  to 
be  a  valid  code  (the  type). 

The  next  step  is  to  describe  the  structure  for  machine  values,  which  must  be 
capable  of  containing  any  atom.  This  is  more  difficult.  We  resort  here  to 
subterfuge.  Our  specification  method  relies  upon  the  extend  operation  to  build 
more  and  more  complex  specifications.  Unfortunately,  there  are  few  Von 
Neumann  languages  which  permit  additions  to  the  definition  of  a  data  type  once 
the  compiler  has  seen  it.  In  C,  we  cannot  specify  directly  two  structures  which 
contain  each  other.  So,  we  resort  to  the  technique  illustrated  in  Figure  5.2.  The 
problem  is  caused  by  the  type  instr  which  represents  the  opcode  returned  when 
each  instruction  operator  is  invoked.  These  instr  atoms  must  contain  values  for 
their  operands  (as  part  of  the  opcode),  but  are  themselves  values,  since  we  must 
be  able  to  store  and  fetch  instructions.  How  else  would  we  get  a  program  into 
memory  and  execute  it?  The  solution  is  to  fool  C  into  thinking  we  are  talking 
about  pointers  to  structures  instead  of  structures  themselves.  This  works  fine 
since  we  implement  an  instruction  opcode  as  a  structure  whose  value  field  is  a 
pointer  to  the  opcodes. 

The  primary  physical  resources  are  also  defined  as  structures  (Figure  5.3). 
Registers,  primary  storage  and  stacks  are  represented  as  arrays  of  arrays  of 
pointers  to  values.  The  reader  should  note  that  a  simple  change  to  the  constants 
in  the  header  files  can  completely  alter  the  configuration  of  the  machine.  We  can 
specify  an  arbitrary  number  of  arbitrarily  long  memory  segments  and  register 
segments,  and  an  arbitrary  number  of  different  sized  stacks.  Files  are  represented 
as  usual  as  an  array  of  structures  containing  status  information  and  an 
input/output  buffer.  The  number  and  type  of  files  can  also  be  changed  by 
A.  IMPLEMENTING DATA TYPES


As in any piece of substantial software, we look first at the data structures
required  to  support  our  algorithm  (which  in  this  case  was  represented  by  the 
specification).  We  chose  C  as  it  appeared  to  provide  the  easiest  translation  from 
the  specification.  In  retrospect,  Lisp  would  work  very  well,  too. 

AM  is  a  tagged  architecture.  Each  data  element  must  be  self  descriptive. 
The  most  likely  construct  to  provide  this  is  a  structure  (record),  and  this  is  what 
we  used.  Figure  5.1  lists  some  fragments  from  the  header  files  used  by  our 


typedef short opcode;

typedef struct {
    short type;
    union value *val;
} INSTR;

typedef union value {
    short type;
    opcode opcodeval;
    BOOL boolval;
    INT intval;
    NAT natval;
    CHAR charval;
    STRING stringval;
    MEMADDR memaddrval;
    REGADDR regaddrval;
    STKADDR stkaddrval;
    FIL fileval;
    INSTR instrval;
    MOP mopval;
    DOP dopval;
    RELOP relopval;
    BOP bopval;
} VAL;


Figure  5.2:  Machine  Values 


V.  IMPLEMENTATION


AM  is  implemented  as  a  finite  state  machine  interpreter.  It  comprises 
approximately  12000  lines  of  C  code,  including  the  assembler.  Details  of  the 
assembler  are  treated  in  Appendix  C.  The  overall  concept  is  quite  simple.  A 
text file representing an assembly language program is translated by the assembler
into  a  relocatable  object  module.  A  loader,  part  of  the  AM  interpreter,  loads  this 
object  module  into  the  appropriate  cells,  and  AM  executes  it. 

There  are  only  four  issues  of  real  interest  concerning  the  details  of  the 
implementation.  These  are  the  representation  of  data  types,  the  mapping  of 
operators  in  the  specification  to  functions  in  the  interpreter,  the  handling  of 
errors,  and  the  execution  of  a  program. 

The  AM  interpreter  is  a  fairly  large  program  by  most  standards.  We  feel  it 
notable  that  the  period  of  time  from  completion  of  the  specification  to  a  working 
version  of  the  interpreter  spanned  just  two  weeks!  We  attribute  this  level  of 
productivity  largely  to  the  existence  of  the  specification,  which  left  absolutely  no 
doubt  about  the  meaning  of  operations.  Once  a  few  mechanical  obstacles  had 
been  bridged,  writing  the  program  was  largely  repetition. 


#define NAT_TYPE 0x0002

typedef unsigned int nat;

typedef struct {
    short type;
    nat val;
} NAT;


Figure  5.1:  Type  Definitions 
To completely specify the effects of errors, a set of axioms must be
supplied  for  each  operator.  We  avoid  this  in  the  interest  of  simplicity,  but 
caution  the  reader  that  errors  have  not  been  properly  treated  here. 

We guarantee there are errors in our specification, and encourage the
reader to do as we have done: stare at the specification, and thoroughly test its
implementation. We note that the design and implementation of a specification
language brings with it the host of problems which follow more conventional
programming languages. We do not have a way of determining if the
specification  is  correct,  let  alone  whether  or  not  it  describes  what  we  want  it  to. 

However,  we  have  demonstrated  how  something  of  useful  complexity  can 
be  described  using  algebraic  specifications.  By  moving  toward  the  problem 
solving  abstraction  from  the  resource  side,  we  require  data  to  be  manipulated  in  a 
representation  independent  way.  We  have  also  shown  that,  by  capturing  the  true 
meaning  of  the  machine's  data  structures  in  a  specification,  we  remove  the 
semantic  ambiguities  usually  encountered  where  a  resource  oriented  instruction 
set  meets  a  problem  oriented  language.  The  instruction  set  becomes  a  medium 
through  which  we  may  unambiguously  express  our  intent  in  a  program. 


This axiom is used to "fire off" a program. The progression from one instruction
to another is given, as one might suspect, in the axioms for each instruction. For
example, consider the move memory-to-memory instruction (mov_m_m). The
axiom which defines its semantics is

xeq(mov_m_m(m1,m2),m,q) =

prog(nextmemaddr(m),storem(fetchm(m1,q),m2,q));

which  means: 

The state resulting from the execution of mov_m_m with operands m1 and
m2 at address m in state q is equivalent to the state of the program at the
next address, after what is fetched from m1 is stored in m2.

Notice  that  the  q's  in  the  axiom  are  identical  (refer  to  the  same  state).  The 
reader  should  see  that,  through  this  axiom,  we  have  fetched,  decoded  and 
executed  an  instruction,  and  incremented  the  program  counter  (m).  The  other 
axioms  in  the  spec  express  exactly  the  same  relationship  between  xeq  and  prog. 

Sequencing  must  be  expressed  as  a  nesting  of  operators.  Thus,  the 
execution  of  an  AM  program  amounts  to  a  non-deterministic  recursion  between 
xeq  and  prog.  Cleverness  on  the  part  of  the  implementor  is  required  to  enable 
AM  to  execute  programs  of  useful  length. 
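
One form such cleverness might take is to replace the mutual recursion with a loop. The sketch below is an assumption-laden illustration, not the AM interpreter: the cell layout, the OP_HALT instruction (the text names no halting instruction), and the function run are all hypothetical. prog and xeq follow the two axioms directly; run computes the same result with one loop iteration per prog/xeq pair, so the depth of the C call stack no longer grows with the length of the program.

```c
#include <assert.h>

#define NCELLS 8                /* hypothetical memory size */

/* OP_HALT is an assumption made so the example can terminate. */
typedef enum { OP_DATA, OP_MOV_M_M, OP_HALT } opkind;

typedef struct {
    opkind op;
    int m1, m2;                 /* operands of mov_m_m */
    int data;                   /* payload when the cell holds plain data */
} cell;

typedef struct { cell mem[NCELLS]; } state;

state xeq(cell ins, int m, state q);

/* prog(m,q) = xeq(fetchm(m,q), m, q) */
state prog(int m, state q)
{
    assert(m >= 0 && m < NCELLS);
    return xeq(q.mem[m], m, q);
}

/* xeq(mov_m_m(m1,m2),m,q) =
 *     prog(nextmemaddr(m), storem(fetchm(m1,q), m2, q)) */
state xeq(cell ins, int m, state q)
{
    if (ins.op == OP_MOV_M_M) {
        assert(ins.m1 >= 0 && ins.m1 < NCELLS);
        assert(ins.m2 >= 0 && ins.m2 < NCELLS);
        q.mem[ins.m2] = q.mem[ins.m1];
        return prog(m + 1, q);
    }
    return q;                   /* OP_HALT or plain data: stop here */
}

/* The same semantics as a loop. */
state run(int m, state q)
{
    for (;;) {
        cell ins;
        assert(m >= 0 && m < NCELLS);
        ins = q.mem[m];
        if (ins.op != OP_MOV_M_M)
            return q;
        assert(ins.m1 >= 0 && ins.m1 < NCELLS);
        assert(ins.m2 >= 0 && ins.m2 < NCELLS);
        q.mem[ins.m2] = q.mem[ins.m1];
        m = m + 1;
    }
}
```

Because the state is passed and returned by value, the caller's original state is left untouched, which matches the way q is threaded through the axioms.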

6.  Remarks 

The  reasons  for  various  distinctions  among  objects  which,  in  a 
conventional  design,  would  more  intuitively  be  lumped  together  are  often  subtle. 
However,  they  reflect  a  conscious  effort  to  capture  the  abstraction  of  a  machine 
at  a  level  low  enough  to  provide  a  degree  of  flexibility  in  writing  the  axioms 
which  define  its  semantics.  The  higher  the  level  of  abstraction,  the  more  difficult 
it  is  to  infer  a  direct  correspondence  between  the  resource  and  the  specification 
which  describes  it. 

We  should  at  this  point  rationalize  our  error  operations  and  move  as 
many  as  possible  into  a  dedicated  error  spec,  where  they  can  be  properly  handled 
as  a  whole.  The  specification  in  Appendix  B  does  not  reflect  this.  The  result  is 
that  error  ops  and  axioms  are  scattered  throughout  the  specification. 
Heap operators are provided to permit the dynamic allocation of
arbitrary sized memory segments for constructing linked lists, frames and other
structures. lalloc returns the identifier of a segment of n cells; lfree deallocates
it, returning it to the heap. The indir operator has been designed expressly to
permit up-level addressing through a list of frames allocated using lalloc.

The  file  operators  are  not  really  part  of  AM.  They  resemble  a  set  of 
typical  operating  system  service  calls.  We  provide  them  to  enable  AM  to 
communicate  with  the  outside  world.  Files  are  much  more  interesting  than  the 
other  structures,  since  their  semantics  resemble  the  operation  of  an  infinite 
sequence  generator.  How,  for  instance,  do  we  specify  the  semantics  of  a 
removable  cartridge  disk  drive?  What  is  read  off  the  file  may  in  no  way  be 
related  to  what  was  written  on  it. 

5.  Execution 

At  this  point  AM  is  essentially  complete.  At  our  disposal  are  all  the 
tools  we  need  to  build  programs  to  manipulate  data  any  way  we  want.  Missing, 
however,  is  a  means  for  executing  programs.  What  we  have  just  described  is  a 
fairly  typical  Von  Neumann  architecture  with  the  bounds  removed.  We  have 
noted  in  the  previous  chapter  how  difficult  it  is  to  express  the  passage  of  time. 
How  do  we  express  the  sequential  execution  of  a  program?  Our  solution  is 
derived  from  that  used  by  Fasel  (1980). 

Define two operators, which we call prog and xeq, that are co-recursive.
The semantics of execution are given by

prog: memaddr,state -> state;

xeq: instr,memaddr,state -> state;

prog(m,q) = xeq(atomofinstr(fetchm(m,q)),m,q);

where m is a memory address (in this case representing the value of the program
counter) and q is the current state of execution. The axiom can be interpreted
like this:

At  any  moment,  the  state  of  the  program  at  address  m  in  state  q  is 
equivalent  to  the  execution  of  the  instruction  stored  at  that  address. 


of  diagram  or  table  showing  the  characteristic  bit  patterns  of  the  opcode,  with 
"holes"  to  be  filled  in  by  the  operands  to  the  instruction.  The  operators  specified 
in  the  aminstructions  spec  correspond  precisely  to  these  diagrams.  They  accept 
zero  or  more  "operands"  and  return  an  atom  representing  the  aggregate  opcode. 
The  AM  assembler  uses  these  operators  to  construct  an  AM  object  program. 
Notice,  there  are  no  axioms  in  this  spec.  After  a  brief  study  the  reader  will  see 
that  these  instructions  are  similar  to  those  found  on  a  large  number  of  popular 
processors.  For  a  description  of  the  naming  conventions,  see  Appendix  C. 

The  only  thing  left  to  do  is  to  relate  the  objects  we  have  just  described  to 
val  so  that  they  can  be  stored  and  fetched  in  the  same  manner  as  the  atomic 
data  types.  To  do  this  we  again  invoke  the  newtype  macro. 

4.  State 

The next spec forms the heart of AM. It describes the semantics of the
physical resource. A new sort is introduced, state, which at any moment
represents  the  state  of  execution.  Every  operator  whose  result  depends  upon  the 
current  state  of  execution  must  accept  a  state  value  as  an  argument,  and  every 
operator  which  alters  the  state  must  return  a  state  value.  Thus  a  familiar 
pattern  develops  throughout  the  operators  in  this  spec.  For  example,  to  examine 
the  value  stored  in  a  register  or  memory  cell,  we  must  provide  not  only  the 
address,  but  also  the  current  state  of  the  machine.  The  state  is  not  altered. 
However,  when  a  value  is  stored,  a  new  state  must  be  returned. 

initam  returns  a  constant  representing  the  "initial  state"  of  the  machine. 
By  implication,  all  values  in  the  machine  in  state  initam  are  "undefined".  This 
is  an  error  condition,  specified  with  the  value  undef.  The  axioms  make  it 
impossible  to  fetch  from  a  cell  whose  value  is  undefined. 

Fetch and store for memory and registers are self-explanatory. Worth
noting are the axioms which relate them as inverses.
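
The state-passing pattern and the inverse relationship between fetch and store can be sketched in C. This is a simplified illustration under our own assumptions: memory is a single small segment of int cells, addresses are plain indices, and the undefined condition is reported through a return code rather than the undef value of the specification. The argument order follows the axioms, storem(value, address, state) and fetchm(address, state).

```c
#include <assert.h>
#include <stddef.h>

#define NCELLS 4                /* hypothetical segment size */

typedef struct {
    int defined[NCELLS];        /* 0 until a value has been stored */
    int cell[NCELLS];
} state;

/* initam: every cell starts out undefined. */
state initam(void)
{
    state q;
    int i;
    for (i = 0; i < NCELLS; i++) {
        q.defined[i] = 0;
        q.cell[i] = 0;
    }
    return q;
}

/* storem receives the state by value and returns the new state;
 * the state it was given is not modified. */
state storem(int v, int m, state q)
{
    assert(m >= 0 && m < NCELLS);
    q.cell[m] = v;
    q.defined[m] = 1;
    return q;
}

/* fetchm examines the state without altering it. It returns 0 on
 * success and -1 if the cell has never been stored into. */
int fetchm(int m, state q, int *out)
{
    assert(m >= 0 && m < NCELLS);
    assert(out != NULL);
    if (!q.defined[m])
        return -1;
    *out = q.cell[m];
    return 0;
}
```

With these definitions, fetching address m from storem(v, m, q) yields v for any state q, while fetching from initam() is always an error.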

The  stack  operators  are  also  straightforward.  Notice  it  is  impossible  to 
alter  the  stack  pointer  (which  is  only  implied  in  the  operators)  except  by  pushing 
or  popping  a  value.  The  axioms  relate  the  operators,  and  make  it  an  error  to  pop 
or  access  the  top  of  an  empty  stack. 


Next we specify the operators for accessing registers. This is easier, since
we may not perform address "arithmetic" on register addresses. We need only
have a way for obtaining all of them, given the first. As with memaddresses, we
provide an operator for deciding whether or not two register addresses are equal.
Notice that we draw a distinction between addresses and integer displacements,
although operators like offset allow mixed use of these and other sorts in
precisely defined ways.

The  stack  is  a  little  more  interesting.  We  do  not  want  the  programmer 
to  have  access  to  the  "inside"  of  the  stack,  nor  do  we  want  to  provide  facilities 
for altering the stack pointer. We therefore provide an operation for returning the
stack  pointer,  and  for  determining  whether  or  not  two  stack  pointers  are  equal, 
but  no  more.  The  anticipated  push  and  pop  operations  cannot  be  defined  here 
for  the  same  reasons  we  have  not  defined  store  and  fetch  operations  —  we  have 
yet  to  define  all  the  objects  which  might  be  stored  or  fetched,  and  we  have  no 
concept  of  a  machine  state.  This  will  be  treated  shortly. 

The  spec  for  files  offers  the  same  "black  box"  abstraction  as  the  stack. 
We  want  to  give  the  programmer  access  only  through  a  carefully  designed  set  of 
as  yet  unspecified  primitives.  Therefore,  the  only  referencing  primitive  is  that  for 
obtaining a file's address.

AM's intrinsic operator codes are next defined in the amoperators spec.
These give the programmer access to the atomic operators provided with each
data type. Each such atomic operator is mapped to a corresponding operator in
amoperators (its machine code). We introduce a new sort for each type of
operator (monadic, dyadic, relational, etc.) and the operators themselves. A set
of apply... operators is also specified. These will accept an intrinsic op of the
appropriate type, plus one or more argument values, and return a result. They
form AM's arithmetic and logic unit (ALU). Also defined here are sets of
relational ops for those types in which they have meaning. These will provide
the programmer with the primitives for conditional branch instructions.
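
The idea of applying an intrinsic operator code to argument values can be sketched in C. The fragment is a guess at the general shape only: the op codes DOP_ADD, DOP_SUB, REL_EQ, REL_NE and REL_LT and the functions applydop and applyrelop are assumed names, and arguments are untagged ints rather than AM values.

```c
#include <assert.h>

typedef enum { DOP_ADD, DOP_SUB } DOP;           /* dyadic op codes (assumed) */
typedef enum { REL_EQ, REL_NE, REL_LT } RELOP;   /* relational op codes (assumed) */

/* Apply a dyadic operator code to two arguments. */
int applydop(DOP op, int a, int b)
{
    switch (op) {
    case DOP_ADD: return a + b;
    case DOP_SUB: return a - b;
    }
    assert(!"unknown dyadic op");
    return 0;
}

/* Apply a relational operator code; the result is a C truth value that
 * a conditional branch instruction could test. */
int applyrelop(RELOP op, int a, int b)
{
    switch (op) {
    case REL_EQ: return a == b;
    case REL_NE: return a != b;
    case REL_LT: return a < b;
    }
    assert(!"unknown relational op");
    return 0;
}
```

Dispatching on an operator code in this way keeps the instruction set small: one instruction can carry any dyadic op as an operand instead of requiring a separate opcode for each.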

The  next  spec  defines  the  instruction  set  as  a  set  of  operators  which  all 
return  an  atom  of  the  sort  instr.  They  are  the  opcode  templates.  In  a  typical 
assembly  language  manual,  the  description  of  each  instruction  includes  some  sort 


-  It  is  the  responsibility  of  the  specification  writer  to  explicitly  define  the 
effects  of  errors  on  the  object  being  described. 

In  our  specification,  the  only  proper  way  to  handle  conversion  errors  is  to  provide 
a  set  of  axioms  defining  the  result  of  an  expression  containing  opposing 
atomof... and valof... operators whose sorts do not match. This is handled in
the  error  spec. 

So, the atomic types are introduced as machine data types. Strings are a
special case, since we must first instantiate the parameterized spec for strings of
characters, and then relate it to val. This is done with spec charstring and spec
str.chartype. Note the dot notation, similar to an aggregate structure reference,
used to denote the relation of the chartype spec to the sort str.

3.  Machine  Primitives 

We  must  now  specify  an  abstraction  of  the  operations  of  the  machine 
itself.  We  need  to  be  able  to  reference  values,  specify  arithmetic  and  logical 
operations, and define instruction opcodes. We start with identifiers.

The  concept  of  identifier,  as  we  use  it  here,  refers  to  the  name  of  an 
abstract  data  structure  composing  some  physical  resource,  such  as  a  memory 
segment,  a  stack,  or  a  file.  Identifiers  are  needed  to  allow  us  to  reference  these 
structures  as  complete  objects.  The  only  operation  we  need  is  a  comparison  for 
equality  for  each  type. 

We  then  write  specifications  for  each  of  the  types  of  addresses  we  will 
need,  one  for  each  AM  data  structure.  The  memaddresses  spec  defines  the 
operators  used  to  reference  values  in  primary  storage.  Given  the  identifier  of  a 
memory segment, the base address is returned by startmemaddr. Successive
and previous addresses may be obtained using nextmemaddr and
prevmemaddr respectively. Note, there is no previous address to
startmemaddr. This condition is defined as an error in the axioms. offset
permits arbitrary values to be referenced as integer displacements from another.
Its semantics is defined recursively. Note how the memaddresses spec defines an
abstraction which exhibits the properties we required for our machine: that the
next (previous) data item is in the next (previous) logical address.
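
A minimal C sketch of these address operators follows. It is illustrative only: the memaddr structure, the use of a negative index to stand for the error address, and the argument order of offset are our assumptions, not details taken from Appendix B. The point it shows is that offset needs nothing beyond nextmemaddr and prevmemaddr.

```c
#include <assert.h>

typedef struct {
    int seg;    /* identifier of the memory segment */
    int idx;    /* logical position within it; idx < 0 marks an error */
} memaddr;

memaddr startmemaddr(int seg)
{
    memaddr m;
    assert(seg >= 0);
    m.seg = seg;
    m.idx = 0;
    return m;
}

memaddr nextmemaddr(memaddr m)
{
    if (m.idx >= 0)
        m.idx++;                /* an error address stays an error */
    return m;
}

memaddr prevmemaddr(memaddr m)
{
    if (m.idx >= 0)
        m.idx--;                /* prev of the start address is an error */
    return m;
}

/* offset is defined recursively in terms of next and prev. */
memaddr offset(memaddr m, int d)
{
    if (d == 0 || m.idx < 0)
        return m;
    if (d > 0)
        return offset(nextmemaddr(m), d - 1);
    return offset(prevmemaddr(m), d + 1);
}

int eqmemaddr(memaddr a, memaddr b)
{
    return a.seg == b.seg && a.idx == b.idx;
}
```

Addresses in different segments never compare equal, and stepping backward from the start of a segment produces the error address rather than an address in some neighbouring structure.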


Now that the atomic types are specified, we must define AM's basic
element  of  storage,  the  value.  The  relationship  of  the  atomic  types  to  machine 
values  must  be  expressed  in  terms  of  value.  The  spec  is  trivial.  It  introduces  a 
single  sort,  val  and  an  error  op  typerr  to  express  the  condition  corresponding  to 
a  type  conversion  error.  Now,  examine  the  macro  newtype  at  the  top  of  the 
specification.  It  expands  a  statement  of  the  form 

newtype(sortname,specname);

into  an  actual  spec  defining  a  new  data  type  to  AM  which  is  an  extension  of  the 
atom's  spec  and  value.  Within  this  spec  are  the  key  operators  and  axioms 
which imply AM's tagged architecture. Using integer as an example, valofint
accepts an atom of type int and returns a val. atomofint accepts a val and
returns an int atom. The special properties of the operator atomofint are
expressed  in  the  axiom 

atomofint(valofint(x)) = x;

which  relates  atomof...  and  valof...  as  inverses.  Thus,  given  any  value  of  type 
val,  atomof...  will  extract  an  atom  of  the  appropriate  type. 
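
The inverse axiom can be checked on a small C model of tagged values. The sketch below is a simplification made for illustration: the tag codes INT_TYPE, BOOL_TYPE and TYPERR are hypothetical, a single tag is used for both atom and value, and a mismatched conversion returns an atom tagged TYPERR instead of invoking an error routine.

```c
#include <assert.h>

/* Hypothetical tag codes. */
enum { INT_TYPE = 1, BOOL_TYPE = 2, TYPERR = -1 };

typedef struct { short type; int val; } INT;
typedef struct { short type; int val; } BOOL;

/* Every member begins with a short tag, so the tag can be read through
 * the type member whichever atom is stored. */
typedef union value {
    short type;
    INT   intval;
    BOOL  boolval;
} VAL;

VAL valofint(INT x)
{
    VAL v;
    assert(x.type == INT_TYPE);
    v.intval = x;
    return v;
}

VAL valofbool(BOOL b)
{
    VAL v;
    assert(b.type == BOOL_TYPE);
    v.boolval = b;
    return v;
}

/* atomofint extracts an INT only when the tag says the value holds one. */
INT atomofint(VAL v)
{
    INT x;
    if (v.type != INT_TYPE) {
        x.type = TYPERR;        /* conversion error */
        x.val = 0;
        return x;
    }
    return v.intval;
}
```

For any INT x, atomofint(valofint(x)) returns x unchanged, while atomofint applied to a value built by valofbool reports a type error.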

Here  we  must  deal  once  again  with  errors.  Operators  are  not  functions, 
and  their  arguments  are  not  parameters.  An  operator's  characteristic  determines 
the  types  of  objects  it  can  accept,  and  the  type  of  object  it  returns.  It  is  an 
abstract  object  which  defines  a  protocol  of  communication  with  respect  to  other 
abstract  objects  in  a  specific  way.  It  is  not  precisely  an  error  for  a  value  of  the 
wrong  sort  to  appear  as  an  argument  for  an  operator.  It  has  no  meaning  at  all. 
In  fact,  algebraic  specifications  provide  no  way  of  expressing  the  relationship  of 
other  objects  to  the  characteristic  of  an  operator.  This  is  one  of  the  stumbling 
blocks  of  the  methodology.  Goguen  (1978)  discusses  this  in  great  detail. 
Unfortunately,  in  the  real  objects  defined  by  the  abstraction,  there  may  come  a 
time  when  an  object  described  with  one  spec  appears  where  an  object  of  another 
type is expected. Therefore, we avoid a rigorous treatment of errors by
substituting  for  a  theory  the  following  rules: 

-  If  any  value  violates  the  characteristic  of  an  operator,  that  operator  returns 
an  error  of  the  type  corresponding  to  its  return  type. 


Many  axioms  in  other  parts  of  the  specification  require  boolean  to  express  their 
meaning. 

Note  that  in  this  and  every  other  spec,  the  spec  name  is  distinct  from  the 
name  of  any  sort.  Any  similarities  in  them  are  purely  arbitrary.  The  name  given 
to  a  spec  denotes  an  abstract  object,  the  aggregate  of  sorts  and  operators  and 
axioms.  The  name  given  a  sort  is  an  index  into  a  set  of  carriers.  It  denotes  a 
specific  set  of  values  which,  together  with  operators,  forms  an  abstract  data  type. 
In  any  but  the  most  simple  specifications,  it  will  be  very  difficult  to  point  to  a 
single  thing  and  say  "This  is  the  data  type  so-and-so."  Throughout  this  thesis 
we loosely refer to "the type int" or "the type integer". This is imprecise, but for
lack  of  a  convenient  way  of  expressing  ourselves,  we  shall  continue  to  freely  mix 
these  terms.  The  reader  is  warned  to  examine  Chapter  2  again  if  this  point  is 
unclear. 

In the spec for boolean, or and implies are specified as derived
operators. We provide them for convenience only. DeMorgan's axioms may be
omitted as well.

Natural  is  then  expressed  as  an  extension  of  boolean,  and  integer  as  an 
extension  of  natural.  A  typical  set  of  operators  is  provided.  We  do  not  specify 
multiplication  or  division,  although  using  conditional  axioms  this  is  not  too 
difficult.  Integer  extends  boolean  to  permit  conversions  to  be  specified.  AM 
allows  conversions  between  no  other  types.  Note  that  the  zero  values  of  natural 
and  integer  are  distinct,  as  are  all  other  members  of  their  respective  carriers. 
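To make the natural axioms concrete, the following C sketch transcribes them directly, using recursion for sumnat. It is an assumption-laden illustration: the NAT struct and the use of an unsigned counter are ours, and the recursive form is shown for clarity only, not as the way AM computes sums.

```c
/* Assumed representation of a nat atom. */
typedef struct { unsigned n; } NAT;

NAT zeronat(void)            { NAT z = {0}; return z; }
NAT succnat(NAT a)           { a.n += 1; return a; }

/* prednat(zeronat) = zeronat; prednat(succnat(n)) = n */
NAT prednat(NAT a)           { if (a.n > 0) a.n -= 1; return a; }

int eqnat(NAT a, NAT b)      { return a.n == b.n; }
int gtnat(NAT a, NAT b)      { return a.n > b.n; }

/* sumnat(n, zeronat)    = n
 * sumnat(n, succnat(m)) = succnat(sumnat(n, m)) */
NAT sumnat(NAT a, NAT b)
{
    if (eqnat(b, zeronat()))
        return a;
    return succnat(sumnat(a, prednat(b)));
}
```

The second branch relies on b being succnat(m) for some m, so that prednat(b) recovers m.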

The  spec  for  character  defines  128  ASCII  codes.  The  symbol  for  each 
character  (each  a  0-ary  operator  returning  a  constant  value)  includes  the 
bracketing  single  quotes. 

String is expressed as a parameterized specification. The parameter
template must be matched in a one-to-one correspondence by some other spec
before a string type may be instantiated. Thus, we may have strings of anything,
so long as a spec exists with a single sort and two operators whose semantics
exactly match the axioms in the parameter template. The syntax we use to express
the mapping of sorts to sorts and operators to operators is awkward but
necessary to prevent the description of impossible objects.


than  slow  down  an  already  slow  interpreter  with  axiomatic  implementations  of 
the  operators. 

One  of  the  design  decisions  we  must  make  is  whether  to  pass  structures  or 
pointers  to  structures  throughout  the  program.  Pointers  are  faster  from  the 
standpoint  of  parameter  passing,  but  make  it  difficult  to  determine  when  to  free 
unwanted  values.  Passing  structures  is  safer,  because  a  new  copy  of  the  data  is 
made within each function, but it is slow. We choose to be slow but safe: we
pass structures.
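The trade-off can be seen in a small C sketch in which the machine state is passed and returned by value. The STATE layout, the integer addresses, and MEMSIZE are assumptions for illustration; AM's real STATE and MEMADDR types are not shown in this chapter.

```c
#include <assert.h>

#define MEMSIZE 8   /* hypothetical size, for the sketch only */

/* Assumed state: a small array of integer cells. */
typedef struct { int cell[MEMSIZE]; } STATE;

/* storem: the callee works on its own copy of q and returns the
 * updated copy; the caller's original state is left untouched. */
STATE storem(int v, int m, STATE q)
{
    assert(m >= 0 && m < MEMSIZE);
    q.cell[m] = v;
    return q;
}

/* fetchm: reads a cell from the (copied) state. */
int fetchm(int m, STATE q)
{
    assert(m >= 0 && m < MEMSIZE);
    return q.cell[m];
}
```

Because every call copies the whole structure, no function can free or overwrite a value that another function still needs, at the cost of the copying time noted above.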

As  the  implementation  proceeds  to  more  and  more  complex  specifications, 
the  program  relies  less  and  less  upon  C  and  more  and  more  upon  the  bulk  of 
operators  which  we  have  defined.  In  fact,  the  more  complex  operators  are 
implemented  as  calls  to  previously  defined  functions  which  almost  directly  mimic 
the  axioms  from  which  they  are  derived.  We  will  illustrate  this  shortly. 
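As a minimal sketch of this layering, the derived boolean operators can be written purely as calls to the primitive ones, following the axioms or(b1,b2) = not(and(not(b1),not(b2))) and implies(b1,b2) = not(and(b1,not(b2))). The C names (bool_not, bool_and, and so on) and the BOOL struct are hypothetical; they are chosen to avoid clashing with C keywords and are not AM's identifiers.

```c
/* Assumed representation of a bool atom. */
typedef struct { int b; } BOOL;

/* Primitive operators, implemented directly in C. */
BOOL bool_true(void)          { BOOL t = {1}; return t; }
BOOL bool_false(void)         { BOOL f = {0}; return f; }
BOOL bool_not(BOOL x)         { x.b = !x.b; return x; }
BOOL bool_and(BOOL x, BOOL y) { x.b = x.b && y.b; return x; }

/* Derived operators: each body is a transcription of its axiom. */
BOOL bool_or(BOOL x, BOOL y)
{
    return bool_not(bool_and(bool_not(x), bool_not(y)));
}

BOOL bool_implies(BOOL x, BOOL y)
{
    return bool_not(bool_and(x, bool_not(y)));
}
```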

C.  ERROR  HANDLING 

All  errors  are  fatal,  but  they  need  not  be.  Those  errors  which  are  not  must 
be  defined  explicitly  in  the  specification.  As  we  have  said,  a  more  detailed 
treatment  of  errors  would  be  an  area  for  further  study. 

AM  flags  most  errors  in  the  operators  which  perform  data  conversions.  This 
is  a  natural  place  for  this  to  occur,  since  it  is  difficult  to  see  how  the  type  of  a 
data element may be changed at any other time. Figure 5.5 lists a fragment
which implements the boolean conversion routines. The routine error() does not
return, but terminates execution after writing the error message to stderr. Notice
that, even if a much larger structure were passed to atomofbool() or valofbool(),
the error would be detected and handled gracefully.

This  type  of  error  checking  is  also  performed  in  the  functions  which 
implement  data  operations. 

D.  EXECUTION 

The final point of interest involves actually executing a program. The
method is also illustrative of the way in which the program mimics the axioms of
the specification. Here, too, we resort to subterfuge to implement in a finite way
a specification which could require the expenditure of an infinite resource (an
implied stack in this case). The problem is the corecursive relationship between
the functions xeq() and prog(). We eliminate this problem by never actually
returning from xeq(). We rely on a dangerous but effective C idiom, setjmp()
and longjmp(). Figure 5.6 illustrates.

#include <setjmp.h>

jmp_buf _context;
MEMADDR cond();

main(argc, argv)
char *argv[];
{
    int ap;

    for (ap = 1; ap < argc; ap++) {
        if (*argv[ap] == '-') {
            if (*(argv[ap] + 1) == 'v')
                traceflag = 1;
        }
    }

    initam();
    amload();

    setjmp(_context);
    _q = prog(_pc, _q);
    exit(0);
}

STATE prog(m, q)
MEMADDR m;
STATE q;
{
    q = xeq(atomofinstr(fetchm(m, q)), m, q);
}

STATE xeq(i, m, q)
INSTR i;
MEMADDR m;
STATE q;
{
    opnd *p;

    if (i.type != INSTR_TYPE)
        error("attempt to execute non-instruction - %x", i.type);

In main(), initam() configures AM and invokes all of the initialization
operators. amload() loads a program from secondary storage into the appropriate
cells as directed by the linker directives in the object module. setjmp() then
saves the state of the "real" machine. The variable _pc is the program counter,
which is set inside amload(). Now everything is set. The program is loaded and
ready to run.
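The control flow can be sketched in isolation as follows. This is a simplified illustration, not AM's code: the two-instruction machine (OP_INC, OP_STOP), the accumulator standing in for the state q, the halted flag, and the driver run_machine() are all assumptions. All machine state is kept in static variables, so nothing that changes between setjmp() and longjmp() lives in an automatic local.

```c
#include <setjmp.h>

enum { OP_INC, OP_STOP };        /* hypothetical instruction set */

static jmp_buf context;          /* saved state of the "real" machine */
static const int *program;       /* the loaded program */
static int pc;                   /* program counter, in the role of _pc */
static int acc;                  /* stand-in for AM's state q */
static int halted;               /* set when OP_STOP is executed */

/* Execute one instruction, then jump back instead of returning,
 * so the C stack never grows with the length of the program. */
static void xeq(int instr)
{
    switch (instr) {
    case OP_INC:  acc++; pc++; break;
    case OP_STOP: halted = 1;  break;
    }
    longjmp(context, 1);
}

/* prog simply fetches the current instruction and hands it to xeq. */
static void prog(void)
{
    xeq(program[pc]);
}

/* Load a program, save the context once, and re-enter prog after
 * every longjmp until the machine halts. Returns the accumulator. */
int run_machine(const int *code)
{
    program = code;
    pc = 0;
    acc = 0;
    halted = 0;

    setjmp(context);
    if (!halted)
        prog();
    return acc;
}
```

Each longjmp() discards the frames of prog() and xeq(), which is what replaces the unbounded recursion implied by the axioms with a loop of constant stack depth.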

prog() is now called. Notice that prog() simply invokes xeq(). Recall now the
axiom which defines the semantics of execution.

prog(m,q) = xeq(atomofinstr(fetchm(m,q)),m,q);

case MOV_M_M:
    q = storem(
            fetchm(
                p[1].memaddrval,
                q
            ),
            p[2].memaddrval,
            q
        );
    _pc = nextmemaddr(_pc);
    break;

Figure 5.7: The Semantics for mov_m_m


The  value  of  a  language  which  permits  usefully  long  and  descriptive  names  is 
obvious  in  this  case.  Within  xeq()  a  large  case  statement  decodes  the  instruction 
and  executes  it  according  to  the  semantics  provided  for  that  case.  This  semantics 
is  very  closely  modeled  on  the  axioms  in  the  specification.  Figure  5.7  lists  one 
such case and its accompanying semantic action. Compare it to the axiom for
mov_m_m,

xeq(mov_m_m(m1,m2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(m1,q),
            m2,
            q
        )
    );

The  similarities  are  not  accidental.  This  should  make  the  point  that  it  is 
beneficial  for  the  implementation  language  to  permit  such  a  close  modeling  of  the 
specification.  Obviously,  this  made  the  implementation  easier  to  write,  easier  to 
debug  and  easier  to  understand. 

E.  OBSERVATIONS 


AM  is  slow  (about  as  fast  as  the  average  Basic  interpreter).  But  we  have 
been  unable  to  make  it  fail  in  2  months  of  testing.  AM  refuses  to  do  anything 
which  has  not  been  expressly  defined  in  the  specification  from  which  it  is 
implemented.  This  is  encouraging. 

As  stated  earlier,  coding  went  extremely  quickly  (about  3000  lines  a  week). 
We  attribute  this  to  the  presence  of  the  specification,  which  was  a  template  for 
the  program,  and  C,  which  translates  nicely  from  our  specification  language.  We 
can  make  a  case  here  for  a  rule  which  would  require  that  the  specification 
language  be  syntactically  and  structurally  similar  to  the  implementation 
languages. 

The  next  step  would  be  to  implement  the  interpreter  in  microcode  on  a 
writable  control  store.  This  may  imply  a  change  in  the  specification  language 
syntax. 

We  designed  and  implemented  a  Von  Neumann  resource,  but  need  not  have 
done  this.  This  methodology  should  be  amenable  to  a  wide  variety  of 
architectures  and  implementations.  In  fact,  if  an  architecture  appears  to  be 
particularly  unsuited  to  formal  specification,  it  should  become  suspect.  We 
strongly  believe  that  because  the  methodology  suggests  a  tagged,  non-overlapping 
storage  organization,  this  tells  us  something  about  the  way  we  should  be 
designing  machines. 

VI.  CONCLUSIONS 

We  have  noted  that  a  semantic  gap  exists  where  concepts  which  are 
primarily  resource  oriented  clash  with  those  which  are  primarily  problem 
oriented.  We  have  described  a  theory  upon  which  to  base  a  method  for  formally 
specifying  the  meaning  of  an  abstract  machine  and  the  resource  it  represents. 
We  have  shown  how  something  usefully  complex  can  be  described  with  this 
method,  and  that  it  can  be  successfully  implemented. 

So,  what  have  we  learned?  As  in  all  cases  where  physical  objects  and  their 
observable  properties  must  be  abstracted,  algebraic  specifications  describe  only 
fragments  of  the  physical  world.  The  writer  of  the  specification  is  faced  with  the 
difficult  task  of  eliding  unnecessary  detail  from  a  collection  of  facts  and 
assumptions  while  capturing  the  essential  semantics,  and  nothing  more.  This  is 
difficult  for  a  number  of  reasons: 

- Designing a specification is at least as difficult as designing a programming
language, with a similar set of issues and problems.

- The writer is obligated to understand and abide by a set of precise
restrictions imposed by the theory upon which the specification method is
based.

- There are no development tools to support this methodology.

- The problem of testing and proving a specification correct is, as yet,
unresolved.

- No method has been developed for finding the differences, if any, between
the semantics actually defined by a specification and those intended by the
writer.

-  The  fact  that  any  implementation  can  be  only  a  finite  instantiation  of  a 
specification  poses  a  similar  set  of  problems  to  those  surrounding  the 
acceptance  of  language  and  hardware  standards. 

These difficulties notwithstanding, we cannot avoid the rising complexity of
hardware and software, nor can we ignore the ways in which resource dependence
adversely affects software portability. We have explored a method for describing
and thinking about machines in a rational way, which permits us to better
understand the relationship between software and the resource upon which it is
implemented.


A.  FUTURE  WORK 

Algebraic  specifications  provide  a  plausible  method  for  formally  describing  a 
physical  resource  abstraction  --  this  we  have  demonstrated.  We  suggest  the 
following  areas  for  continuing  research: 

-  Implement  a  specification  in  microcode,  using  a  writable  control  store. 

-  Port  the  abstract  machine  interpreter  to  a  number  of  different  physical 
resources. 

-  Implement  a  high  level  language  on  the  abstract  machine,  and  test  its 
portability  between  several  implementations  of  the  machine. 

-  Rationalize  the  treatment  of  errors  within  a  specification. 

-  Develop  an  abstraction  for  a  file  system  and  a  bit-mapped  display. 

-  Write  a  compiler  which  can  perform  syntactic  and  semantic  analyses  on  a 
specification,  determine  its  properties,  and  generate  a  test  suite  of  terms  to 
validate  it. 

- Examine a variety of architectures as to their describability using the
algebraic specification methodology.

APPENDIX A: A GRAMMAR FOR ALGEBRAIC SPECIFICATION


abstraction:
    (abstraction spec)?

spec:
    (spechead | parmhead) specbody specend

spechead:
    nameblk 'is'

parmhead:
    nameblk 'parm' specbody 'is'

specend:
    'end' specname

nameblk:
    'spec' specname

specbody:
    extension
    | specblk

extension:
    extendblk specblk 'end' 'extend'

extendblk:
    'extend' specnames 'with'

specnames:
    specname
    | specnames specname

specblk:
    useblk
    | sortblk? opblk axiomblk?

useblk:
    'use' specname '(' specname ')' mapping? specblk 'enduse'

mapping:
    'where' equivlist

equivlist:
    equivalence
    | equivlist equivalence ';'

equivalence:
    sortname 'is' sortname
    | opname 'is' opname

sortblk:
    'sort' sortnames

sortnames:
    sortname
    | sortnames sortname

opblk:
    primblk? dervblk? errblk?

primblk:
    'primitive' 'op' ops

ops:
    op
    | ops op

op:
    opname arglist? '->' sortname

arglist:
    sortname
    | arglist sortname

dervblk:
    'derived' 'op' ops

errblk:
    'error' 'op' ops

axiomblk:
    'axiom' axioms

axioms:
    axiom ';'
    | axioms axiom ';'

axiom:
    ('for' varlist 'in' sortname)? termexpr '=' termexpr

termexpr:
    factor
    | opname '(' factors ')'

factors:
    factor
    | factors factor

factor:
    opname
    | freevar

varlist:
    freevar
    | varlist freevar


APPENDIX  B:  THE  SPECIFICATION  FOR  AM 


'  amsper 

replace(X,S)
"equivrel(X,S)"
with
"for i,j,k in S
    X(i,i) = true;
    X(i,j) = X(j,i);
    implies(and(X(i,j),X(j,k)),X(i,k)) = true"

replace(X,S)
"reflexive(X,S)"
with
"for i in S
    X(i,i) = true"

replace(X,S)
"commutative(X,S)"
with
"for i,j in S
    X(i,j) = X(j,i)"

replace(X,S)
"transitive(X,S)"
with
"for i,j,k in S
    implies(and(X(i,j),X(j,k)),X(i,k)) = true"

replace(X,S)
"associative(X,S)"
with
"for i,j,k in S
    X(i,X(j,k)) = X(X(i,j),k)"

replace(X,S)
"irreflexive(X,S)"
with
"for i in S
    X(i,i) = false"

replace(X,S)
"symmetric(X,S)"
with
"for i,j in S
    implies(X(i,j),X(j,i)) = true"


replace(X,S)
"antisymmetric(X,S)"
with
"for i,j in S
    implies(and(X(i,j),X(j,i)),(i = j)) = true"


replace(S,T)
"newtype(S,T)"
with
"spec Stype
is
extend
    T,
    value
with
primitive
op
    atomofS: val -> S;
    valofS: S -> val;
error
op
    Serr: -> S;
axiom
    for x in val
    atomofS(valofS(x)) = x;
    atomofS(typerr) = Serr;
end extend;
end Stype"

spec boolean
is
sort
    bool;
primitive
op
    true: -> bool;
    false: -> bool;
    not: bool -> bool;
    and: bool,bool -> bool;
derived
op
    or: bool,bool -> bool;
    implies: bool,bool -> bool;
axiom
    false = not(true);
    not(not(b)) = b;
    and(true,b) = b;
    and(false,b) = false;
    not(and(b1,b2)) = or(not(b1),not(b2));
    or(b1,b2) = not(and(not(b1),not(b2)));
    not(or(b1,b2)) = and(not(b1),not(b2));
    or(true,b) = true;
    or(false,b) = b;
    commutative(and,bool);
    commutative(or,bool);
    implies(b1,b2) = not(and(b1,not(b2)));
end boolean;


spec natural
is
extend
    boolean
with
sort
    nat;
primitive
op
    prednat: nat -> nat;
    succnat: nat -> nat;
    sumnat: nat,nat -> nat;
    zeronat: -> nat;
    eqnat: nat,nat -> bool;
    gtnat: nat,nat -> bool;
axiom
    prednat(zeronat) = zeronat;
    prednat(succnat(n)) = n;
    commutative(sumnat,nat);
    associative(sumnat,nat);
    sumnat(n,zeronat) = n;
    sumnat(n,succnat(m)) = succnat(sumnat(n,m));
    equivrel(eqnat,nat);
    irreflexive(gtnat,nat);
    transitive(gtnat,nat);
    gtnat(succnat(n),n) = true;
end extend;
end natural;

spec integer
is
extend
    boolean,
    nat
with
sort
    int;
primitive
op
    predint: int -> int;
    succint: int -> int;
    sumint: int,int -> int;
    zeroint: -> int;
    eqint: int,int -> bool;
    gtint: int,int -> bool;
    ntoi: nat -> int;
    iton: int -> nat;
error
op
    nconverr: -> nat;
axiom
    predint(succint(n)) = n;
    commutative(sumint,int);
    associative(sumint,int);
    sumint(n,zeroint) = n;
    sumint(n,succint(m)) = succint(sumint(n,m));
    equivrel(eqint,int);
    irreflexive(gtint,int);
    transitive(gtint,int);
    gtint(succint(n),n) = true;
    ntoi(zeronat) = zeroint;
    ntoi(succnat(n)) = sumint(succint(zeroint),ntoi(n));
    iton(zeroint) = zeronat;
    iton(succint(n)) =
        if or(gtint(n,zeroint),eqint(n,zeroint)) = true
        then
            sumnat(succnat(zeronat),iton(n));
        else
            nconverr;
end extend;
end integer;


spec am
is
extend
    amstate,
    memaddrtype,
    regaddrtype,
    stkaddrtype,
    instrtype
with
sort
    type;
primitive
op
    prog: memaddr,state -> state;
hidden
op
    xeq: instr,memaddr,state -> state;
    cond: val,memaddr,memaddr -> memaddr;
    whattype: val -> type;
    typeundef: -> type;
    typebool: -> type;
    typechar: -> type;
    typenat: -> type;
    typeint: -> type;
    typestring.char: -> type;
    typememaddr: -> type;
    typefile: -> type;
    typeinstr: -> type;
    eqtype: type,type -> bool;
    isundef: -> bop;
    isbool: -> bop;
    ischar: -> bop;
    isnat: -> bop;
    isint: -> bop;
    isstring.char: -> bop;
    isinstr: -> bop;
    ismemaddr: -> bop;
    isfile: -> bop;
axiom
    whattype undef = typeundef;
    whattype valofbool(b) = typebool;
    whattype valofchar(c) = typechar;
    whattype valofnat(n) = typenat;
    whattype valofint(i) = typeint;
    whattype valofstring.char(s) = typestring.char;
    whattype valofmemaddr(m) = typememaddr;
    whattype valoffile(f) = typefile;
    whattype valofinstr(i) = typeinstr;

    replace(S)
    "isops(S);"
    with
    "applybop(isS,v) =
        if eqtype(whattype v,typeS) = true then
            valofbool true;
        else
            valofbool false;
        endif;"
    isops(bool);
    isops(char);
    isops(nat);
    isops(int);
    = nonallocerr;
    storem(v,offset(n,startmemaddr(lalloc(n1,q))),lfree(lalloc(n1,q),q))
        = staterr;
    offset(n,offset(n1,startmemaddr(lalloc(n2,q)))) =
        if or(gtint(n,ntoi(n2)),eqint(n,ntoi(n2))) = true
        then
            accesserr;
        else
            offset(sumint(n,n1),
                startmemaddr(lalloc(n2,q)));
    indir(zeronat,m) = m;
    indir(succnat(n),m) = atomofmemaddr(fetchm(indir(n,m),q));
    infile(f,openfile(s,f,wmode,x,q)) = ioerr;
    infile(f,initam) = ioerr;
    infile(f,close(f,q)) = ioerr;
    outfile(v,f,close(f,q)) = staterr;
    outfile(v,f,initam) = staterr;
    outfile(f,openfile(s,f,m,chardata,q)) = staterr;
    outfile(v,f,openfile(s,f,rmode,x,q)) = staterr;
    closefile(f,openfile(s,f,n,x,q)) = q;
    openfile(s,f,n,openfile(s,f,m,x,q)) = staterr;
end extend;
end amstate;

spec amstate
is
extend
    aminstructions,
    identifiers
with
primitive
op
    initam: -> state;
    storer: val,regaddr,state -> state;
    fetchr: regaddr,state -> val;
    storem: val,memaddr,state -> state;
    fetchm: memaddr,state -> val;
    initstk: stkaddr,state -> state;
    topstk: stkaddr,state -> val;
    pushstk: val,stkaddr,state -> state;
    popstk: stkaddr,state -> state;
    lalloc: nat,state -> memid;
    lfree: memid,state -> state;
    indir: nat,memaddr -> memaddr;
    infile: file,state -> val;
    outfile: val,file,state -> state;
    openfile: str.char,file,int,int,state -> state;
    closefile: file,state -> state;
    rmode: -> int;
    wmode: -> int;
    rwmode: -> int;
    openerr: -> int;
    openok: -> int;
    valdata: -> int;
    chardata: -> int;
    undef: -> val;

    ioerr: -> val;
    staterr: -> state;
    emptystkerr: -> val;
    undflowstkerr: -> state;
    nonallocerr: -> val;
    accesserr: -> memaddr;
axiom
    implies(eqmemaddr(a1,a2),fetchm(a1,storem(v,a2,q)) = v)
        = true;
    implies(not(eqmemaddr(m1,m2)),fetchm(m1,storem(v,m2,q)) = fetchm(m1,q))
        = true;
    fetchm(m,initam) = undef;
    storem(fetchm(m,q),m,q) = q;
    implies(eqregaddr(r1,r2),fetchr(r1,storer(v,r2,q)) = v)
        = true;
    implies(not(eqregaddr(r1,r2)),fetchr(r1,storer(v,r2,q)) = fetchr(r2,q))
        = true;
    fetchr(r,initam) = undef;
    storer(fetchr(r,q),r,q) = q;
    topstk(s,pushstk(v,s,q)) = v;
    popstk(s,pushstk(v,s,q)) = q;
    topstk(s,initstk(s)) = emptystkerr;
    popstk(s,initstk(s)) = undflowstkerr;
    popstk(s,initam) = undflowstkerr;

    fetchm(offset(n,startmemaddr(lalloc(n1,q))),lfree(lalloc(n1,q),q))

    push_m: memaddr,stkaddr -> instr;
    push_pcr: int,stkaddr -> instr;
    push_ri: regaddr,stkaddr -> instr;
    push_rid: regaddr,int,stkaddr -> instr;
    push_ridn: regaddr,nat,int,stkaddr -> instr;
    pushi: val,stkaddr -> instr;
    pop_r: stkaddr,regaddr -> instr;
    pop_m: stkaddr,memaddr -> instr;
    pop_pcr: stkaddr,int -> instr;
    pop_ri: stkaddr,regaddr -> instr;
    pop_rid: stkaddr,regaddr,int -> instr;
    pop_ridn: stkaddr,regaddr,nat,int -> instr;
    popx: stkaddr -> instr;
    jmp: memaddr -> instr;
    jmp_ri: memaddr -> instr;
    jmp_r: regaddr -> instr;
    bra: int -> instr;
    bra_r: regaddr -> instr;
    if: relop,regaddr,regaddr,memaddr -> instr;
    ifi: relop,regaddr,val,memaddr -> instr;
    ifte: relop,regaddr,regaddr,memaddr,memaddr -> instr;
    iftei: relop,regaddr,val,memaddr,memaddr -> instr;
    if_pcr: relop,regaddr,regaddr,int -> instr;
    ifi_pcr: relop,regaddr,val,int -> instr;
    ifte_pcr: relop,regaddr,regaddr,int,int -> instr;
    iftei_pcr: relop,regaddr,val,int,int -> instr;
    test: bop,regaddr,memaddr -> instr;
    testm: bop,memaddr,memaddr -> instr;
    teste: bop,regaddr,memaddr,memaddr -> instr;
    testme: bop,memaddr,memaddr,memaddr -> instr;
    test_pcr: bop,regaddr,int -> instr;
    testm_pcr: bop,memaddr,int -> instr;
    teste_pcr: bop,regaddr,int,int -> instr;
    testme_pcr: bop,memaddr,int,int -> instr;
    stop: -> instr;
    jsr: memaddr,stkaddr -> instr;
    jsr_ri: memaddr,stkaddr -> instr;
    jsr_r: regaddr,stkaddr -> instr;
    bsr: int,stkaddr -> instr;
    bsr_r: regaddr,stkaddr -> instr;
    rts: stkaddr -> instr;
    link: regaddr,nat -> instr;
    unlink: regaddr -> instr;
    open: stkaddr -> instr;
    close: stkaddr -> instr;
    read: stkaddr -> instr;
    write: stkaddr -> instr;
    org: -> instr;
    extern: -> instr;
    globl: -> instr;
    mbegin: -> instr;
    mend: -> instr;
end extend;
end aminstructions;

newtype(memaddr,memaddresses);
newtype(regaddr,regaddresses);
newtype(stkaddr,stkaddresses);
newtype(file,files);
newtype(instr,aminstructions);

spec aminstructions
is
extend
    amoperators,
    memaddresses,
    regaddresses,
    stkaddresses
with
sort
    instr;
primitive
op
    dyads: dop,regaddr,regaddr -> instr;
    dyadsi: dop,val,regaddr -> instr;
    dyad: dop,regaddr,regaddr,regaddr -> instr;
    dyadi: dop,val,regaddr,regaddr -> instr;
    monads: mop,regaddr -> instr;
    monad: mop,regaddr,regaddr -> instr;
    monadi: mop,val,regaddr -> instr;
    offst: int,regaddr -> instr;
    mov_m_m: memaddr,memaddr -> instr;
    mov_pcr_pcr: int,int -> instr;
    mov_ri_m: regaddr,memaddr -> instr;
    mov_ri_pcr: regaddr,int -> instr;
    mov_rid_m: regaddr,int,memaddr -> instr;
    mov_rid_pcr: regaddr,int,int -> instr;
    mov_ridn_m: regaddr,nat,int,memaddr -> instr;
    mov_ridn_pcr: regaddr,nat,int,int -> instr;
    mov_m_ri: memaddr,regaddr -> instr;
    mov_pcr_ri: int,regaddr -> instr;
    mov_m_rid: memaddr,regaddr,int -> instr;
    mov_pcr_rid: int,regaddr,int -> instr;
    mov_m_ridn: memaddr,regaddr,nat,int -> instr;
    mov_pcr_ridn: int,regaddr,nat,int -> instr;
    mov_ri_ri: regaddr,regaddr -> instr;
    mov_rid_ri: regaddr,int,regaddr -> instr;
    mov_ridn_ri: regaddr,nat,int,regaddr -> instr;
    mov_ri_rid: regaddr,regaddr,int -> instr;
    mov_ri_ridn: regaddr,regaddr,nat,int -> instr;
    mov_rid_rid: regaddr,int,regaddr,int -> instr;
    mov_ridn_rid: regaddr,nat,int,regaddr,int -> instr;
    mov_rid_ridn: regaddr,int,regaddr,nat,int -> instr;
    mov_ridn_ridn: regaddr,nat,int,regaddr,int,int -> instr;
    movi_m: val,memaddr -> instr;
    movi_pcr: val,int -> instr;
    movi_ri: val,regaddr -> instr;
    movi_rid: val,regaddr,int -> instr;
    movi_ridn: val,regaddr,nat,int -> instr;
    movi_r: val,regaddr -> instr;
    mov_r_r: regaddr,regaddr -> instr;
    mov_m_r: memaddr,regaddr -> instr;
    mov_pcr_r: int,regaddr -> instr;
    mov_ri_r: regaddr,regaddr -> instr;
    mov_rid_r: regaddr,int,regaddr -> instr;
    mov_ridn_r: regaddr,nat,int,regaddr -> instr;
    mov_r_m: regaddr,memaddr -> instr;
    mov_r_pcr: regaddr,int -> instr;
    mov_r_ri: regaddr,regaddr -> instr;
    mov_r_rid: regaddr,regaddr,int -> instr;
    mov_r_ridn: regaddr,regaddr,nat,int -> instr;
    push_r: regaddr,stkaddr -> instr;

        valofS(gtS(atomofS v1,atomofS v2))"
    monadic(boolnot,not,bool);
    dyadic(booland,and,bool);
    dyadic(boolor,or,bool);
    dyadic(natsum,sumnat,nat);
    dyadic(intsum,sumint,int);
    dyadic(charstrlen,lenstr,char,str.char);
    dyadic(charconcat,catstr,char,str.char);
    relationalops(nat);
    relationalops(int);
    relationalops(char);
    relationalops(str.char);
end extend;
end amoperators;

spec amoperators
is
extend
    booltype,
    nattype,
    inttype,
    chartype,
    str.chartype
with
sort
    mop;
    dop;
    relop;
    bop;
primitive
op
    applymop: mop,val -> val;
    applydop: dop,val,val -> val;
    applyrel: relop,val,val -> val;
    applybop: bop,val -> val;
    boolnot: -> mop;
    booland: -> dop;
    boolor: -> dop;
    natsum: -> dop;
    intsum: -> dop;
    charstrlen: -> mop;
    charconcat: -> dop;
    charmakestr: -> mop;
    charheadstr: -> mop;
    chartailstr: -> mop;

    replace(S)
    "relationalops(S)"
    with
    "Sgt: -> relop;
    Seq: -> relop"
error
op
    operr: -> val;
    moperr: -> mop;
    doperr: -> dop;
    reloperr: -> relop;
axiom
    applymop(m,typerr) = operr;
    applydop(d,v,typerr) = operr;
    applydop(d,typerr,v) = operr;

    replace(M,O,S)
    "monadic(M,O,S)"
    with
    "applymop(M,v) =
        valofS(O(atomofS v))"

    replace(D,O,S)
    "dyadic(D,O,S)"
    with
    "applydop(D,v1,v2) =
        valofS(O(atomofS v1,atomofS v2))"

    replace(S)
    "relations(S)"
    with
    "applyrel(Seq,v1,v2) =
        valofS(eqS(atomofS v1,atomofS v2));
    applyrel(Sgt,v1,v2) =

spec regaddresses
is
extend
    identifiers,
    boolean
with
sort
    regaddr;
primitive
op
    startregaddr: regid -> regaddr;
    nextregaddr: regaddr -> regaddr;
    eqregaddr: regaddr,regaddr -> bool;
axiom
    equivrel(eqregaddr,regaddr);
    eqregaddr(startregaddr(i1),startregaddr(i2)) = eqregid(i1,i2);
    eqregaddr(startregaddr(i),nextregaddr(a)) = false;
    eqregaddr(nextregaddr(a1),nextregaddr(a2)) = eqregaddr(a1,a2);
end extend;
end regaddresses;


spec stkaddresses
is
extend
    identifiers,
    boolean
with
sort
    stkaddr;
primitive
op
    eqstkaddr: stkaddr,stkaddr -> bool;
    stkpointer: stkid -> stkaddr;
axiom
    equivrel(eqstkaddr,stkaddr);
    eqstkaddr(stkpointer(i1),stkpointer(i2)) = eqstkid(i1,i2);
end extend;
end stkaddresses;


spec files
is
extend
    identifiers,
    boolean
with
sort
    file;
primitive
op
    getfile: fid -> file;
    eqfile: file,file -> bool;
axiom
    equivrel(eqfile,file);
    eqfile(getfile(i1),getfile(i2)) = eqfid(i1,i2);
end extend;
end devaddresses;

spec identifiers
is
sort
    memid;
    regid;
    stkid;
    fid;
primitive
op
    mem: -> memid;
    reg: -> regid;
    stk: -> stkid;
    eqmemid: memid,memid -> bool;
    eqregid: regid,regid -> bool;
    eqstkid: stkid,stkid -> bool;
    eqfid: fid,fid -> bool;
axiom
    equivrel(eqmemid,memid);
    equivrel(eqregid,regid);
    equivrel(eqstkid,stkid);
    equivrel(eqfid,fid);
end identifiers;


spec memaddresses
is
extend
    identifiers,
    boolean
with
sort
    memaddr;
primitive
op
    startmemaddr: memid -> memaddr;
    nextmemaddr: memaddr -> memaddr;
    predmemaddr: memaddr -> memaddr;
    eqmemaddr: memaddr,memaddr -> bool;
    getmemid: memaddr -> memid;
    offset: int,memaddr -> memaddr;
error
op
    memaddrerr: -> memaddr;
axiom
    equivrel(eqmemaddr,memaddr);
    predmemaddr(nextmemaddr(m)) = m;
    predmemaddr(startmemaddr(i)) = memaddrerr;
    eqmemaddr(startmemaddr(i1),startmemaddr(i2)) = eqmemid(i1,i2);
    eqmemaddr(startmemaddr(i),nextmemaddr(a)) = false;
    eqmemaddr(nextmemaddr(a1),nextmemaddr(a2)) = eqmemaddr(a1,a2);
    offset(zeroint,m) = m;
    offset(succint(n),m) = nextmemaddr(offset(n,m));
    offset(predint(n),m) = predmemaddr(offset(n,m));
    eqmemid(i,getmemid(offset(n,startmemaddr(i)))) = true;
end extend;
end memaddresses;

spec value
is
sort
    val;
error
op
    typerr: -> val;
end value;

newtype(bool,boolean);
newtype(int,integer);
newtype(nat,natural);
newtype(char,character);

spec charstring
is
extend
    chartype
with
use
    string(character)
where
    char is lm;
    eqchar is eqlm;
    gtchar is gtlm;
end extend;
end charstring;


spec str.chartype
is
extend
    charstring
with
primitive
op
    atomofstr.char: val -> str.char;
    valofstr.char: str.char -> val;
error
op
    str.charerr: -> str.char;
axiom
    for x in val
    atomofstr.char(valofstr.char(x)) = x;
    atomofstr.char(typerr) = str.charerr;
end extend;
end str.chartype;


spec  string 
parm 

extend 

boolean 

with 

sort 

lm; 

primitive 

op 

eqlm:  lm.lm  —  bool; 
gtlm:  lm,lm  — *  bool; 
axiom 

equivrel(eqlm,im); 
irreflexive(gt  lm.lm); 
transit  ive(gt  lm.lm); 
end  extend; 
is 

extend 

natural, 

boolean 

with 

sort 

str; 

primitive 

op 

nullstr:  —  str; 
makestr:  lm  -»  str; 
catstr;  str, str  —  str; 
lenstr:  str  -•  nat; 
headstr:  str  -•  lm; 
tailstr:  str  —  str; 
eqstr;  str, str  -•  bool; 
gtstr:  str, str  —  bool; 
axiom 

lenstr(nullstr) = zeronat;
lenstr(makestr(l)) = succnat(zeronat);
lenstr(catstr(s1,s2)) = sumnat(lenstr(s1),lenstr(s2));
headstr(makestr(l)) = l;
tailstr(makestr(l)) = nullstr;
headstr(catstr(makestr(l),s)) = l;
tailstr(catstr(makestr(l),s2)) = s2;
headstr(nullstr) = strerr;
tailstr(nullstr) = nullstr;
catstr(catstr(s1,s2),s3) = catstr(s1,catstr(s2,s3));
catstr(nullstr,s) = catstr(s,nullstr) = s;
equivrel(eqstr,str);
irreflexive(gtstr,str);
transitive(gtstr,str);
implies(eqlm(l1,l2),eqstr(makestr(l1),makestr(l2))) = true;
implies(gtlm(l1,l2),gtstr(makestr(l1),makestr(l2))) = true;
gtnat(lenstr(makestr(l)),lenstr(nullstr)) = true;
implies(gtnat(lenstr(s1),lenstr(s2)),gtstr(s1,s2)) = true;
if not eqnat(lenstr(s1),zeronat) then
gtnat(lenstr(catstr(s1,s2)),lenstr(s2)) = true;
else
eqnat(lenstr(catstr(s1,s2)),lenstr(s2)) = true;
end if;
end  extend; 
end  string; 
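
The axioms of the string specification can be checked against a small executable model. The sketch below is one possible model, not the implementation used for AM: it assumes strings are represented as Python tuples of elements, and it raises an exception where the specification yields the error value strerr.

```python
NULLSTR = ()  # assumed representation of nullstr: the empty tuple


def makestr(element):
    """makestr: lm -> str, a one-element string."""
    return (element,)


def catstr(s1, s2):
    """catstr: str,str -> str, concatenation."""
    return s1 + s2


def lenstr(s):
    """lenstr: str -> nat."""
    return len(s)


def headstr(s):
    """headstr: str -> lm; headstr(nullstr) is the error case."""
    if not s:
        raise ValueError("strerr")
    return s[0]


def tailstr(s):
    """tailstr: str -> str; tailstr(nullstr) = nullstr."""
    return s[1:]
```

In this model the associativity and identity axioms for catstr hold because tuple concatenation already has those properties.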


gtchar(ENQ,EOT) = true;
gtchar(EOT,ETX) = true;
gtchar(ETX,STX) = true;
gtchar(STX,SOH) = true;
gtchar(SOH,NUL) = true;
end extend;
end character;
[Axioms of the form gtchar(x,y) = true follow for each pair of adjacent printable characters, from 'z' down to '!'; their operands are illegible in the source.]

gtchar('!',SP) = true;
gtchar(SP,US) = true;
gtchar(US,RS) = true;
gtchar(RS,GS) = true;
gtchar(GS,FS) = true;
gtchar(FS,ESC) = true;
gtchar(ESC,SUB) = true;
gtchar(SUB,EM) = true;
gtchar(EM,CAN) = true;
gtchar(CAN,ETB) = true;
gtchar(ETB,SYN) = true;
gtchar(SYN,NAK) = true;
gtchar(NAK,DC4) = true;
gtchar(DC4,DC3) = true;
gtchar(DC3,DC2) = true;
gtchar(DC2,DC1) = true;
gtchar(DC1,DLE) = true;
gtchar(DLE,SI) = true;
gtchar(SI,SO) = true;
gtchar(SO,CR) = true;
gtchar(CR,FF) = true;
gtchar(FF,VT) = true;
gtchar(VT,LF) = true;
gtchar(LF,HT) = true;
gtchar(HT,BS) = true;
gtchar(BS,BEL) = true;
gtchar(BEL,ACK) = true;
gtchar(ACK,ENQ) = true;
spec  character

is 

extend 

boolean 

with 

sort 

char; 

primitive 

op 

'A','B','C', ... ,'Z': -> char;
'a','b','c', ... ,'z': -> char;

r,-@\ 

,  w  , 

'  —  char; 

'1','2','3','4','5','6','7','8','9','0': -> char;
NUL: -> char;
SOH: -> char;
STX: -> char;
ETX: -> char;
EOT: -> char;
ENQ: -> char;
ACK: -> char;
BEL: -> char;
BS: -> char;
HT: -> char;
LF: -> char;
VT: -> char;
FF: -> char;
CR: -> char;
SO: -> char;
SI: -> char;
DLE: -> char;
DC1: -> char;
DC2: -> char;
DC3: -> char;
DC4: -> char;
NAK: -> char;
SYN: -> char;
ETB: -> char;
CAN: -> char;
EM: -> char;
SUB: -> char;
ESC: -> char;
FS: -> char;
GS: -> char;
RS: -> char;
US: -> char;
SP: -> char;
DEL: -> char;
eqchar: char,char -> bool;
gtchar: char,char -> bool;

axiom
equivrel(eqchar,char);
irreflexive(gtchar,char);
transitive(gtchar,char);
gtchar(DEL,'~') = true;
gtchar('~','}') = true;
gtchar('}','|') = true;
gtchar('|','{') = true;
gtchar('{','z') = true;


isops(string.char);
isops(memaddr);
isops(instr);
isops(file);

equivrel(eqtype,type);

implies(
    eqtype(whattype(v1),whattype(v2)) = false,
    applyrel(inteq,v1,v2) = valofbool(false)
) = true;
implies(
    eqtype(whattype(v1),whattype(v2)) = false,
    applyrel(nateq,v1,v2) = valofbool(false)
) = true;
implies(
    eqtype(whattype(v1),whattype(v2)) = false,
    applyrel(chareq,v1,v2) = valofbool(false)
) = true;
implies(
    eqtype(whattype(v1),whattype(v2)) = false,
    applyrel(string.chareq,v1,v2) = valofbool(false)
) = true;

cond(valofbool(true),a1,a2) = a1;
cond(valofbool(false),a1,a2) = a2;
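
The applyrel and cond axioms say two things: a relation applied to values of different types yields the boolean value false, and cond selects between two alternatives according to a boolean value. The following Python sketch illustrates both. It assumes a typed value is modelled as a (type, atom) pair and that a relation such as inteq can be stood in for by an ordinary Python predicate; the behaviour for operands of equal type is an assumption, since the axioms above only cover the mismatched case.

```python
import operator


def valofbool(b):
    """Wrap a Python bool as a typed value (assumed representation)."""
    return ("bool", b)


def whattype(v):
    """Return the type tag of a typed value."""
    return v[0]


def applyrel(rel, v1, v2):
    """Apply a relation to two typed values."""
    if whattype(v1) != whattype(v2):
        # eqtype(whattype(v1),whattype(v2)) = false implies the result is false
        return valofbool(False)
    # Assumed: for equal types, the relation is applied to the atoms.
    return valofbool(rel(v1[1], v2[1]))


def cond(test, a1, a2):
    """cond(valofbool(true),a1,a2) = a1; cond(valofbool(false),a1,a2) = a2."""
    if test == valofbool(True):
        return a1
    if test == valofbool(False):
        return a2
    raise TypeError("cond expects a boolean value")


inteq = operator.eq  # hypothetical stand-in for the inteq relation
```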

prog(a,q) = xeq(atomofinstr(fetchm(a,q)),a,q);

xeq(dyads(o,r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applydop(o,fetchr(r1,q),fetchr(r2,q)),
            r2,
            q
        )
    );

xeq(dyadsi(o,v,r1),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applydop(o,v,fetchr(r1,q)),
            r1,
            q
        )
    );

xeq(dyad(o,r1,r2,r3),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applydop(o,fetchr(r1,q),fetchr(r2,q)),
            r3,
            q
        )
    );

xeq(dyadi(o,v,r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applydop(o,v,fetchr(r1,q)),
            r2,
            q
        )
    );

xeq(monads(o,r1),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applymop(o,fetchr(r1,q)),
            r1,
            q
        )
    );

xeq(monad(o,r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applymop(o,fetchr(r1,q)),
            r2,
            q
        )
    );

xeq(monadi(o,v,r1),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            applymop(o,v),
            r1,
            q
        )
    );

xeq(offst(i,r),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            valofmemaddr(offset(i,atomofmemaddr(fetchr(r,q)))),
            r,
            q
        )
    );
xeq(mov_m_m(m1,m2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(m1,q),
            m2,
            q
        )
    );

xeq(mov_pcr_pcr(i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,m),q),
            offset(i2,m),
            q
        )
    );

xeq(mov_ri_m(r,m1),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(atomofmemaddr(fetchr(r,q)),q),
            m1,
            q
        )
    );

xeq(mov_ri_pcr(r,i),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(atomofmemaddr(fetchr(r,q)),q),
            offset(i,m),
            q
        )
    );

xeq(mov_rid_m(r,i,m1),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i,atomofmemaddr(fetchr(r,q))),q),
            m1,
            q
        )
    );

xeq(mov_rid_pcr(r,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,atomofmemaddr(fetchr(r,q))),q),
            offset(i2,m),
            q
        )
    );

xeq(mov_ridn_m(r,n,i,m1),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i,indir(n,atomofmemaddr(fetchr(r,q)))),q),
            m1,
            q
        )
    );

xeq(mov_ridn_pcr(r,n,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,indir(n,atomofmemaddr(fetchr(r,q)))),q),
            offset(i2,m),
            q
        )
    );

xeq(mov_m_ri(m1,r),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(m1,q),
            atomofmemaddr(fetchr(r,q)),
            q
        )
    );

xeq(mov_pcr_ri(i,r),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i,m),q),
            atomofmemaddr(fetchr(r,q)),
            q
        )
    );

xeq(mov_m_rid(m1,r,n),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(m1,q),
            offset(n,atomofmemaddr(fetchr(r,q))),
            q
        )
    );

xeq(mov_pcr_rid(i1,r,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,m),q),
            offset(i2,atomofmemaddr(fetchr(r,q))),
            q
        )
    );

xeq(mov_m_ridn(m1,r,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(m1,q),
            offset(i2,indir(i1,atomofmemaddr(fetchr(r,q)))),
            q
        )
    );

xeq(mov_pcr_ridn(i1,r,n,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,m),q),
            offset(i2,indir(n,atomofmemaddr(fetchr(r,q)))),
            q
        )
    );

xeq(mov_ri_ri(r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(atomofmemaddr(fetchr(r1,q)),q),
            atomofmemaddr(fetchr(r2,q)),
            q
        )
    );

xeq(mov_rid_ri(r1,i,r2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i,atomofmemaddr(fetchr(r1,q))),q),
            atomofmemaddr(fetchr(r2,q)),
            q
        )
    );

xeq(mov_ridn_ri(r1,i1,i2,r2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i2,indir(i1,atomofmemaddr(fetchr(r1,q)))),q),
            atomofmemaddr(fetchr(r2,q)),
            q
        )
    );

xeq(mov_ri_rid(r1,r2,n),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(atomofmemaddr(fetchr(r1,q)),q),
            offset(n,atomofmemaddr(fetchr(r2,q))),
            q
        )
    );

xeq(mov_ri_ridn(r1,r2,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(atomofmemaddr(fetchr(r1,q)),q),
            offset(i2,indir(i1,atomofmemaddr(fetchr(r2,q)))),
            q
        )
    );

xeq(mov_rid_rid(r1,i1,r2,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,atomofmemaddr(fetchr(r1,q))),q),
            offset(i2,atomofmemaddr(fetchr(r2,q))),
            q
        )
    );

xeq(mov_ridn_rid(r1,i1,i2,r2,i3),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i2,indir(i1,atomofmemaddr(fetchr(r1,q)))),q),
            offset(i3,atomofmemaddr(fetchr(r2,q))),
            q
        )
    );

xeq(mov_rid_ridn(r1,i1,r2,i2,i3),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i1,atomofmemaddr(fetchr(r1,q))),q),
            offset(i3,indir(i2,atomofmemaddr(fetchr(r2,q)))),
            q
        )
    );

xeq(mov_ridn_ridn(r1,i1,i2,r2,i3,i4),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchm(offset(i2,indir(i1,atomofmemaddr(fetchr(r1,q)))),q),
            offset(i4,indir(i3,atomofmemaddr(fetchr(r2,q)))),
            q
        )
    );

xeq(movi_m(v,m1),m,q) =
    prog(
        nextmemaddr(m),
        storem(v,m1,q)
    );

xeq(movi_pcr(v,i),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            v,
            offset(i,m),
            q
        )
    );

xeq(movi_ri(v,r),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            v,
            atomofmemaddr(fetchr(r,q)),
            q
        )
    );

xeq(movi_rid(v,r,n),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            v,
            offset(n,atomofmemaddr(fetchr(r,q))),
            q
        )
    );

xeq(movi_ridn(v,r,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            v,
            offset(i2,indir(i1,atomofmemaddr(fetchr(r,q)))),
            q
        )
    );

xeq(movi_r(v,r),m,q) =
    prog(nextmemaddr(m),storer(v,r,q));
xeq(mov_r_r(r1,r2),m,q) =
    prog(nextmemaddr(m),storer(fetchr(r1,q),r2,q));
xeq(mov_m_r(m1,r),m,q) =
    prog(nextmemaddr(m),storer(fetchm(m1,q),r,q));

xeq(mov_pcr_r(i,r),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            fetchm(offset(i,m),q),
            r,
            q
        )
    );

xeq(mov_ri_r(r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            fetchm(atomofmemaddr(fetchr(r1,q)),q),
            r2,
            q
        )
    );

xeq(mov_rid_r(r1,n,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            fetchm(offset(n,atomofmemaddr(fetchr(r1,q))),q),
            r2,
            q
        )
    );

xeq(mov_ridn_r(r1,i1,i2,r2),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            fetchm(offset(i2,indir(i1,atomofmemaddr(fetchr(r1,q)))),q),
            r2,
            q
        )
    );

xeq(mov_r_m(r,m1),m,q) =
    prog(nextmemaddr(m),storem(fetchr(r,q),m1,q));

xeq(mov_r_pcr(r,i),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchr(r,q),
            offset(i,m),
            q
        )
    );

xeq(mov_r_ri(r1,r2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchr(r1,q),
            atomofmemaddr(fetchr(r2,q)),
            q
        )
    );

xeq(mov_r_rid(r1,r2,n),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchr(r1,q),
            offset(n,atomofmemaddr(fetchr(r2,q))),
            q
        )
    );

xeq(mov_r_ridn(r1,r2,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            fetchr(r1,q),
            offset(i2,indir(i1,atomofmemaddr(fetchr(r2,q)))),
            q
        )
    );

xeq(push_r(r,s),m,q) =
    prog(nextmemaddr(m),pushstk(fetchr(r,q),s,q));
xeq(push_m(m1,s),m,q) =
    prog(nextmemaddr(m),pushstk(fetchm(m1,q),s,q));

xeq(push_pcr(i,s),m,q) =
    prog(
        nextmemaddr(m),
        pushstk(
            fetchm(offset(i,m),q),
            s,
            q
        )
    );

xeq(push_ri(r,s),m,q) =
    prog(
        nextmemaddr(m),
        pushstk(
            fetchm(atomofmemaddr(fetchr(r,q)),q),
            s,
            q
        )
    );

xeq(push_rid(r,n,s),m,q) =
    prog(
        nextmemaddr(m),
        pushstk(
            fetchm(offset(n,atomofmemaddr(fetchr(r,q))),q),
            s,
            q
        )
    );

xeq(push_ridn(r,i1,i2,s),m,q) =
    prog(
        nextmemaddr(m),
        pushstk(
            fetchm(offset(i2,indir(i1,atomofmemaddr(fetchr(r,q)))),q),
            s,
            q
        )
    );


xeq(pushi(v,s),m,q) =
    prog(nextmemaddr(m),pushstk(v,s,q));

xeq(pop_r(s,r),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storer(topstk(s,q),r,q)
        )
    );

xeq(pop_m(s,m1),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storem(topstk(s,q),m1,q)
        )
    );

xeq(pop_pcr(s,i),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storem(topstk(s,q),offset(i,m),q)
        )
    );

xeq(pop_ri(s,r),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storem(topstk(s,q),atomofmemaddr(fetchr(r,q)),q)
        )
    );

xeq(pop_rid(s,r,n),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storem(topstk(s,q),offset(n,atomofmemaddr(fetchr(r,q))),q)
        )
    );

xeq(pop_ridn(s,r,i1,i2),m,q) =
    prog(
        nextmemaddr(m),
        popstk(
            s,
            storem(topstk(s,q),offset(i2,indir(i1,atomofmemaddr(fetchr(r,q)))),q)
        )
    );


xeq(popx(s),m,q) =
    prog(nextmemaddr(m),popstk(s,q));
xeq(jmp(m1),m,q) =
    prog(m1,q);
xeq(jmp_mi(m1),m,q) =
    prog(atomofmemaddr(fetchm(m1,q)),q);
xeq(jmp_r(r),m,q) =
    prog(atomofmemaddr(fetchr(r,q)),q);
xeq(bra(n),m,q) =
    prog(offset(n,nextmemaddr(m)),q);
xeq(bra_r(r),m,q) =
    prog(offset(atomofint(fetchr(r,q)),nextmemaddr(m)),q);

xeq(if(o,r1,r2,m1),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r1,q),fetchr(r2,q)),
            m1,
            nextmemaddr(m)
        ),
        q
    );

xeq(ifi(o,r,v,m1),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r,q),v),
            m1,
            nextmemaddr(m)
        ),
        q
    );

xeq(ifte(o,r1,r2,m1,m2),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r1,q),fetchr(r2,q)),
            m1,
            m2
        ),
        q
    );

xeq(iftei(o,r,v,m1,m2),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r,q),v),
            m1,
            m2
        ),
        q
    );

xeq(if_pcr(o,r1,r2,n),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r1,q),fetchr(r2,q)),
            offset(n,nextmemaddr(m)),
            nextmemaddr(m)
        ),
        q
    );

xeq(ifi_pcr(o,r,v,n),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r,q),v),
            offset(n,nextmemaddr(m)),
            nextmemaddr(m)
        ),
        q
    );

xeq(ifte_pcr(o,r1,r2,i1,i2),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r1,q),fetchr(r2,q)),
            offset(i1,nextmemaddr(m)),
            offset(i2,nextmemaddr(m))
        ),
        q
    );

xeq(iftei_pcr(o,r,v,i1,i2),m,q) =
    prog(
        cond(
            applyrel(o,fetchr(r,q),v),
            offset(i1,nextmemaddr(m)),
            offset(i2,nextmemaddr(m))
        ),
        q
    );

xeq(test(o,r1,m1),m,q) =
    prog(
        cond(
            applybop(o,fetchr(r1,q)),
            m1,
            nextmemaddr(m)
        ),
        q
    );

xeq(testm(o,m2,m1),m,q) =
    prog(
        cond(
            applybop(o,fetchm(m2,q)),
            m1,
            nextmemaddr(m)
        ),
        q
    );

xeq(teste(o,r1,m1,m2),m,q) =
    prog(cond(applybop(o,fetchr(r1,q)),m1,m2),q);
xeq(testme(o,m3,m1,m2),m,q) =
    prog(cond(applybop(o,fetchm(m3,q)),m1,m2),q);

xeq(test_pcr(o,r1,n),m,q) =
    prog(
        cond(
            applybop(o,fetchr(r1,q)),
            offset(n,nextmemaddr(m)),
            nextmemaddr(m)
        ),
        q
    );

xeq(testm_pcr(o,m2,n),m,q) =
    prog(
        cond(
            applybop(o,fetchm(m2,q)),
            offset(n,nextmemaddr(m)),
            nextmemaddr(m)
        ),
        q
    );

xeq(teste_pcr(o,r1,i1,i2),m,q) =
    prog(
        cond(
            applybop(o,fetchr(r1,q)),
            offset(i1,nextmemaddr(m)),
            offset(i2,nextmemaddr(m))
        ),
        q
    );

xeq(testme_pcr(o,m3,i1,i2),m,q) =
    prog(
        cond(
            applybop(o,fetchm(m3,q)),
            offset(i1,nextmemaddr(m)),
            offset(i2,nextmemaddr(m))
        ),
        q
    );

xeq(stop,m,q) = prog(m,q) = q;
xeq(jsr(m1,s),m,q) =
    prog(m1,pushstk(valofmemaddr(nextmemaddr(m)),s,q));

xeq(jsr_mi(m1,s),m,q) =
    prog(
        atomofmemaddr(fetchm(m1,q)),
        pushstk(valofmemaddr(nextmemaddr(m)),s,q)
    );

xeq(jsr_r(r,s),m,q) =
    prog(
        atomofmemaddr(fetchr(r,q)),
        pushstk(valofmemaddr(nextmemaddr(m)),s,q)
    );

xeq(bsr(n,s),m,q) =
    prog(
        offset(n,nextmemaddr(m)),
        pushstk(valofmemaddr(nextmemaddr(m)),s,q)
    );

xeq(bsr_r(r,s),m,q) =
    prog(
        offset(atomofint(fetchr(r,q)),nextmemaddr(m)),
        pushstk(valofmemaddr(nextmemaddr(m)),s,q)
    );

xeq(rts(s),m,q) =
    prog(atomofmemaddr(topstk(s,q)),popstk(s,q));

xeq(link(r,n),m,q) =
    prog(
        nextmemaddr(m),
        storer(
            valofmemaddr(startmemaddr(lalloc(n,q))),
            r,
            storem(
                fetchr(r,q),
                startmemaddr(lalloc(n,q)),
                q
            )
        )
    );

xeq(unlink(r),m,q) =
    prog(
        nextmemaddr(m),
        lfree(
            getmemid(atomofmemaddr(fetchr(r,q))),
            storer(
                fetchm(atomofmemaddr(fetchr(r,q)),q),
                r,
                q
            )
        )
    );


xeq(open(s),m,q) =
    prog(
        nextmemaddr(m),
        openfile(
            atomofstr.char(topstk(s,popstk(s,popstk(s,popstk(s,q))))),
            atomoffile(topstk(s,popstk(s,popstk(s,q)))),
            atomofint(topstk(s,popstk(s,q))),
            atomofint(topstk(s,q)),
            popstk(s,q)
        )
    );

xeq(close(s),m,q) =
    prog(
        nextmemaddr(m),
        closefile(
            atomoffile(topstk(s,q)),
            popstk(s,q)
        )
    );

xeq(read(s),m,q) =
    prog(
        nextmemaddr(m),
        storem(
            infile(
                atomoffile(topstk(s,popstk(s,q))),
                popstk(s,q)
            ),
            atomofmemaddr(topstk(s,q)),
            popstk(s,q)
        )
    );

xeq(write(s),m,q) =
    prog(
        nextmemaddr(m),
        outfile(
            fetchm(
                atomofmemaddr(topstk(s,popstk(s,q))),
                popstk(s,q)
            ),
            atomoffile(topstk(s,q)),
            popstk(s,q)
        )
    );

end  extend; 
end  am; 
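
The prog and xeq axioms describe a fetch-and-execute cycle: prog fetches the instruction at an address and hands it to xeq, and each xeq axiom ends by calling prog on the next address with an updated state. The Python sketch below mirrors that mutual recursion for four instructions only. It is an illustration under simplifying assumptions, not the AM interpreter: memory addresses are plain integers, the state q is a dictionary, and the opcode names "add" and "sub" are hypothetical.

```python
import operator

# Hypothetical dyadic opcodes; the specification leaves applydop abstract here.
DYADIC_OPS = {"add": operator.add, "sub": operator.sub}


def storer(value, r, q):
    """Return a new state in which register r holds value (q is not mutated)."""
    regs = dict(q["regs"])
    regs[r] = value
    return {"mem": q["mem"], "regs": regs}


def prog(a, q):
    """prog(a,q) = xeq(atomofinstr(fetchm(a,q)),a,q)"""
    return xeq(q["mem"][a], a, q)


def xeq(instr, m, q):
    """Execute one instruction at address m, then continue with prog."""
    name, *args = instr
    if name == "stop":
        # xeq(stop,m,q) = q
        return q
    if name == "movi_r":
        # prog(nextmemaddr(m), storer(v,r,q))
        v, r = args
        return prog(m + 1, storer(v, r, q))
    if name == "dyad":
        # storer(applydop(o,fetchr(r1,q),fetchr(r2,q)), r3, q)
        o, r1, r2, r3 = args
        result = DYADIC_OPS[o](q["regs"][r1], q["regs"][r2])
        return prog(m + 1, storer(result, r3, q))
    if name == "bra":
        # prog(offset(n,nextmemaddr(m)), q)
        (n,) = args
        return prog(m + 1 + n, q)
    raise ValueError(f"unknown instruction: {name}")


program = {
    0: ("movi_r", 2, "r0"),
    1: ("movi_r", 3, "r1"),
    2: ("bra", 1),              # skips the instruction at address 3
    3: ("movi_r", 99, "r1"),
    4: ("dyad", "add", "r0", "r1", "r2"),
    5: ("stop",),
}
final_state = prog(0, {"mem": program, "regs": {}})
```

Each instruction returns a fresh state rather than updating one in place, which keeps the sketch close to the functional style of the axioms.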




APPENDIX  C:  A  SIMPLE  ASSEMBLER  FOR  AM 


1.  Introduction 

AMASM  is  an  assembler  which  produces  a  relocatable  load  module  for  AM, 
an  abstract  machine  interpreter.  This  document  constitutes  the  reference  manual 
for  Version  1.0.  It  provides  a  description  of  the  syntax  and  semantics  of  the 
assembler  as  well  as  a  description  of  the  salient  features  of  the  AM  machine  and 
a  definition  of  the  opcodes  executed  by  AM. 

AMASM  is,  to  the  extent  possible,  written  in  portable  C.  Readers  desiring 
to  port  the  code  to  16-bit  machines  may  have  to  make  slight  changes  to  "defines" 
since  long  is  assumed  to  occupy  32  bits,  and  short  16  bits. 

The  input  syntax  of  AMASM  is  similar  to  that  of  other  assemblers.  It 
supports  symbolic  addresses  and  constants  and  a  typical  set  of  directives,  but  has 
no  macro  capabilities.  The  assembler  accepts  an  ASCII  source  file  created  on  a 
conventional  text  editor  and  produces  an  output  file  containing  relocation 
information  and  AM  opcodes.  The  output  file  may  be  loaded  using  the  AM 
loader  and  executed  by  AM. 

2.  Usage 

AMASM is invoked with the following command line syntax:
amasm [-t] [-l] file ...

AMASM produces a single load module "a.vm", which forms the input to the
AM loader. The optional "-t" switch sends a debugging trace to "stdout". The
optional "-l" switch generates the listing and cross-reference file "a.x". Appended
to this file is a hex dump of "a.vm".

3.  Lexical  Conventions 

Assembler  tokens  include  identifiers  (alternatively,  "symbols"  or  "names"), 
literal  constants,  operators  and  delimiters. 

3.1.  Identifiers 

Legal  identifiers  are  described  by  the  following  regular  expression: 

[A-Za-z_][A-Za-z0-9_]*

Identifiers consist of a letter or underline followed by a string of zero or more
letters, decimal digits and underlines. Upper and lower case are distinct.
Identifiers  may  represent  symbolic  constants,  instruction  mnemonics,  labels, 
addresses  and  type  names. 
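
As an illustration of the identifier rule, the following Python sketch tests a token against it. The function name is_identifier is an assumption made for this example and is not taken from the AMASM source.

```python
import re

# A letter or underline, followed by zero or more letters, digits and underlines.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(token):
    """Return True when the whole token is a legal identifier."""
    return IDENTIFIER.fullmatch(token) is not None
```

Because upper and lower case are distinct, "Loop" and "loop" would both be accepted and would name different symbols.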

3.2.  Operators 

The  following  are  considered  to  be  operators: 

==  !=  <  <=  >  >=

+  -  *  /  %  &  !

The  meaning  of  the  above  symbols  varies  with  context. 
3.3.  Literal  Constants

Decimal  and  hexadecimal  constants  are  described  by  the  following  regular 
expressions  respectively: 

[-+][0-9]+|[0-9]+

$[0-9A-Za-z]+

Decimal constants consist of an optional sign followed immediately by one or more
decimal digits. Hexadecimal constants consist of the character "$" followed
immediately by a string of one or more decimal digits and upper or lower case
letters "A" through "F". Numeric constants may represent addresses, integer
and natural numbers, boolean and character values.
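
The two numeric forms can be recognized and converted as in the Python sketch below. It is only an illustration: the function name parse_numeric is an assumption, and the hexadecimal pattern follows the prose in accepting only the letters "A" through "F" in either case.

```python
import re

DECIMAL = re.compile(r"[-+]?[0-9]+")      # optional sign, then digits
HEXADECIMAL = re.compile(r"\$[0-9A-Fa-f]+")  # "$", then hex digits


def parse_numeric(token):
    """Convert a decimal or hexadecimal constant to an integer."""
    if DECIMAL.fullmatch(token):
        return int(token, 10)
    if HEXADECIMAL.fullmatch(token):
        return int(token[1:], 16)  # drop the leading "$"
    raise ValueError(f"not a numeric constant: {token!r}")
```

A sign is accepted only on decimal constants, so a token such as "-$10" is rejected.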

Character constants consist of a single quote followed either by an ASCII
character (other than a newline) or by a numeric constant, followed by a closing
single quote.

String  constants  consist  of  a  string  of  zero  or  more  ASCII  characters  (except 
newline)  enclosed  in  double  quotes. 

3.4.  Blanks 

Blanks  and  tabs  are  ignored  by  the  assembler  except  where  required  to 
separate  adjacent  constants  or  identifiers. 

3.5.  Comments 

The  character  produces  a  comment.  The  assembler  ignores  all  further
characters  on  the  line  up  to  the  terminating  newline.

3.6.  Delimiters 

All  other  characters  found  in  the  input  stream  are  treated  as  delimiters. 

4.  Statements 

A  source  program  is  composed  of  a  sequence  of  statements  which  are 
separated  by  newlines.  There  are  3  kinds  of  statements:  directives,  instructions 
and  null. 

Instructions  and  null  statements  may  be  preceded  by  a  label.  Directives  may 
(in  some  cases,  must)  be  preceded  by  an  identifier. 

4.1.  Labels  &  Identifiers 

A label consists of an identifier followed by a colon. When the assembler
encounters a label, the effect is to assign the current value of the location counter
to the name.

An  identifier  preceding  a  directive  is  assigned  a  value  whose  type  depends 
upon  the  directive.  For  instance,  the  equate  directive  assigns  a  typed  value  to 
an  identifier,  while  the  define  storage  directive  assigns  the  current  value  of  the 
location  counter. 

Neither  labels  nor  identifiers  may  be  redefined  within  a  single  source  file. 
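
The rules for labels and identifiers amount to a symbol table driven by the location counter. A minimal Python sketch follows; the class and method names are assumptions for illustration, and the location counter is modelled as a single integer rather than a segmented memory address.

```python
class SymbolTable:
    """Track symbol definitions and the assembler's location counter."""

    def __init__(self):
        self.symbols = {}
        self.location = 0

    def _define(self, name, value):
        # Neither labels nor identifiers may be redefined within a source file.
        if name in self.symbols:
            raise ValueError(f"symbol redefined: {name}")
        self.symbols[name] = value

    def label(self, name):
        """A label takes the current value of the location counter."""
        self._define(name, ("addr", self.location))

    def equate(self, name, typed_value):
        """An equate-style directive binds a typed value to the identifier."""
        self._define(name, typed_value)

    def advance(self, cells=1):
        """Move the location counter past emitted cells."""
        self.location += cells


table = SymbolTable()
table.label("start")
table.advance(3)
table.label("loop")
table.equate("LIMIT", ("int", 10))
```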

4.2.  Null  Statements 

A  null  statement  is  an  empty  statement.  Although  ignored  by  the  assembler, 
null  statements  may  be  preceded  by  a  label. 


4.3.  Directive  Statements 

A  directive  is  a  command  to  the  assembler  to  perform  some  sort  of  operation 
which  does  not  involve  emitting  an  executable  instruction.  Typical  directives 
(also  known  as  "pseudo  ops"  or  "pseudo  instructions")  allocate  storage  for 
variables,  make  names  within  the  current  module  visible  to  other  modules  and  set 
the  location  counter.  Directives  also  produce  instructions  for  the  AM  linker  and 
loader. 

Directives  consist  of  a  keyword  followed  by  zero  or  more  arguments, 
depending  upon  the  context.  Directives  and  their  syntax  are  described  in  more 
detail  in  Section  11. 

4.4.  Instruction  Statements 

Instruction  statements  produce  the  code  which  is  ultimately  executed  by  AM. 
An  instruction  may  be  preceded  by  a  label,  and  consists  of  a  keyword  followed  by 
zero  or  more  arguments,  depending  upon  context. 

The  AM  instruction  set  and  its  syntax  will  be  described  in  detail  in  Section 
13. 

5.  The  Machine 

Because  AM  differs  from  conventional  machines  in  a  number  of  important 
ways,  some  discussion  is  necessary  before  introducing  the  instruction  set. 
Outwardly  similar  to  a  number  of  well  known  examples,  AM  instructions  form  an 
unconventional  set  of  primitive  operations  which  implement  a  formally  specified 
semantics.  The  reasons  for  this  are  described  below. 

AM uses a tagged architecture. Thus, each data element contains within it
information which uniquely identifies a finite set of legal operations which may be
performed  upon  it,  as  well  as  a  range  of  legal  values  it  may  take  on.  This  set  of 
operations  and  values  is  known  formally  as  a  data  type.  AM  supports  a  number 
of  data  types.  An  element  of  a  particular  data  type  will  be  referred  to 
throughout  the  rest  of  this  manual  as  an  atom. 

AM  physical  resources  are  partitioned  into  segments.  There  are  several 
types  of  segments,  and  these  together  form  a  conventional  overall  model  of  the 
familiar  stored  program  computer.  There  are  memory  segments  (primary 
storage),  register  segments  (high-speed  memory),  stacks,  and  file  segments 
(secondary  storage).  Segments  are  further  partitioned  into  discrete,  addressable 
elements  (alternatively,  "cells")  which  will  contain  atoms  during  the  execution  of 
a  program.  These  elements  will  be  referred  to  repeatedly  as  typed  values.  The 
reason  for  the  distinction  between  atoms  and  values  will  become  more  clear 
shortly. 

AM is the finite implementation of a formal specification. As such, data
elements and the operations which can be applied to them must reflect a
mathematical consistency not required by conventional architectures. Since all
operations which affect the state of the machine must be able to "communicate"
with each other during the execution of an AM program, they must do so using a
common object. This object is a value. The memory, the registers, the stack and
the files all hold values. Store, fetch, execute, read and write, like any operation
which changes the state of the machine, all operate on values (i.e., storage cells).
All other operations, such as "add", "multiply", "and", "or", work on atoms.
Atomic operations in AM correspond to those which take place in the temporary
registers of the arithmetic and logic unit of a conventional processor.
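
The distinction between values and atoms can be made concrete with a short Python sketch. It assumes a value is a tagged cell holding a type name and an atom, with conversion functions named after the valofint and atomofint operations of the specification; raising an exception stands in for the typerr error value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A storage cell: a type tag together with the atom it carries."""
    tag: str
    atom: object


def valofint(n):
    """Wrap an integer atom as a value that can be stored or fetched."""
    return Value("int", n)


def atomofint(v):
    """Unwrap an integer atom; any other tag is a type error."""
    if v.tag != "int":
        raise TypeError("typerr")
    return v.atom


def add(v1, v2):
    """An atomic operation: unwrap both operands, add, wrap the result."""
    return valofint(atomofint(v1) + atomofint(v2))
```

State-changing operations would move Value objects between cells unchanged, while an operation like add works on the atoms inside them and checks the tags as it unwraps.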

5.1.  Configuration 

A  unique  feature  of  AM  is  the  ease  with  which  it  is  possible  to  reconfigure 
the  machine  by  partitioning  the  physical  resources  in  different  ways.  A  typical 
configuration  would  be  something  like  this: 

2  memory  segments 

1  register  segment  (with  a  useful  number  of  registers) 

1  stack 
4  files 

The  configuration  chosen  should  provide  a  good  indication  of  the  types  of 
programs  AM  is  intended  to  execute. 

Note  that,  in  conventional  machines,  stacks  are  implemented  in  primary 
storage.  This  constitutes  an  overloading  of  data  structures  which  obscures  the 
intent  of  the  user  of  these  structures.  It  also  creates  a  semantic  nightmare  for  the 
specification  writer.  In  AM,  stacks  take  their  rightful  places  as  separate  entities 
with  easy  to  understand  properties. 

In  addition  to  the  resources  listed  above,  AM  has  a  conventional  program 
counter. 

5.1.1.  Memory 

AM  memory  is  partitioned  into  segments  which  may  be  of  unequal  but  fixed 
length.  A  program  and  its  data  will  reside  in  memory  segments.  It  is  not 
necessary  that  code  and  data  share  the  same  segment,  nor  is  it  required  that  code 
and  data  be  contiguous.  The  loader  will  determine  from  the  origin  directive 
where  to  load  code  and  data  values. 

The  AM  heap  is  implemented  as  a  set  of  operations  which  allocate  and 
deallocate  memory  segments. 

AM  has  a  rich  set  of  addressing  modes  which  interact  with  a  powerful  move 
instruction  which  allows  the  programmer  to  move  a  value  from  "anywhere  to 
anywhere". 

5.1.2.  Registers 

AM  registers  form  the  high-speed  storage  into  which  operands  are  placed. 

All  atomic  operations,  such  as  add  and  divide,  require  operands  to  be  in 
registers. 

5.1.3.  Stack 

The  AM  stack  is  conventional  in  every  respect  except  that  it  is  impossible  to 
access  any  value  except  the  top.  Thus,  frames  are  implemented  on  the  heap,  not 
the  stack. 

AM  has  a  typical  set  of  push  and  pop  instructions  for  operating  on  stacks. 
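
A stack that exposes only its top can be sketched in Python as follows. The class name and the choice of exception are assumptions for illustration; the three methods correspond to the pushstk, topstk and popstk operations used in the specification.

```python
class TopOnlyStack:
    """A stack whose only readable cell is the top one."""

    def __init__(self):
        self._cells = []  # private: no indexing is offered to callers

    def push(self, value):
        self._cells.append(value)

    def top(self):
        if not self._cells:
            raise IndexError("stack is empty")
        return self._cells[-1]

    def pop(self):
        if not self._cells:
            raise IndexError("stack is empty")
        self._cells.pop()
```

Since there is no way to read below the top, a procedure frame cannot live on such a stack, which is why frames are allocated on the heap.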
5.1.4.  Files

Input/output  is  implemented  rather  arbitrarily  along  the  lines  of  system  calls 
to  an  operating  system  and  should  not  be  considered  part  of  AM  itself. 
Instructions are provided to open, close, read from and write to a file.

6.  Atoms 

An  atom  is  a  component  of  a  data  type.  The  assembler  recognizes  the 
following  types  of  atoms: 

boolean 

natural 

integer 

character 

string 

memory  address 
register  address 
stack  address 
file  address 

As  operands  to  instruction  mnemonics,  these  atoms  form  the  familiar  set  of  literal 
and  symbolic  constants  found  in  typical  assembly  language  programs. 

Atoms may appear in the form of literal constants:

$d0f1

'a'

"this is a string atom"

They  may  also  appear  as  symbols  which  take  on  the  value  of  the  atom  in  some 
other  part  of  the  source  program.  With  few  exceptions,  anywhere  a  literal 
constant  may  be  used,  a  symbolic  constant  of  the  appropriate  type  may  also  be 
used. 

The  assembler  distinguishes  between  types  of  atom  using  syntax  and  context. 
The  syntax  is  described  below. 

6.1.  Boolean 

A boolean atom has only two values, true and false. These values are
represented to the assembler by the decimal or hexadecimal constants for 1 and
0, respectively.

0 

1 

$1 

$0 

are  legal  boolean  atoms. 

6.2.  Natural 

This  type  represents,  as  the  name  implies,  the  natural  (unsigned)  numbers. 
Legal  values  range  from  zero  to  positive  infinity.  Natural  numbers  are 
represented  to  the  assembler  as  decimal  or  hexadecimal  constants  whose  values 
are  greater  than  or  equal  to  zero. 

0 

$2f5 

240 

are  legal  natural  atoms. 

6.3.  Integer 

Integers  range  from  negative  to  positive  infinity,  and  are  specified  as 
hexadecimal  or  signed  or  unsigned  decimal  constants. 

-250 

0 

$ed67f 
+10

are  legal  integer  atoms. 

6.4.  Character 

Character  atoms  may  take  values  defined  by  the  ASCII  character  set.  They 
are  represented  to  the  assembler  as  literal  character  constants. 

'a' 

'r ' 

are  legal  character  atoms. 

6.5.  String 

String  atoms  are  composed  of  zero  or  more  concatenated  ASCII  characters. 
They  are  specified  as  literal  strings. 

"this  is  a  legal  string  atom" 

""

are  both  legal  string  atoms. 

6.6.  Memory  Address 

Memory  address  atoms  consist  of  two  components:  a  segment  address,  and  an 
element  address.  Memory  addresses  are  represented  as  an  ordered  pair  of 
unsigned  decimal  or  hexadecimal  constants,  separated  by  a  colon  and  enclosed 
within  parentheses  "("  ")". 

(0:100) 

represents  memory  segment  0,  element  100. 

(2:$10) 

represents  segment  2,  element  16.

Segment  and  element  addresses  start  at  0.  The  number  and  size  of  available 
memory  segments  depends  upon  the  current  configuration  of  AM. 
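
A memory address atom of the form described above can be parsed as in this Python sketch. The function name parse_memaddr is an assumption, whitespace inside the parentheses is tolerated as a convenience, and no check is made against the configured number or size of segments.

```python
import re

# An unsigned decimal constant or a "$"-prefixed hexadecimal constant.
_UNSIGNED = r"(\$[0-9A-Fa-f]+|[0-9]+)"
MEMADDR = re.compile(r"\(\s*" + _UNSIGNED + r"\s*:\s*" + _UNSIGNED + r"\s*\)")


def _to_int(constant):
    """Convert one unsigned constant to an integer."""
    if constant.startswith("$"):
        return int(constant[1:], 16)
    return int(constant, 10)


def parse_memaddr(text):
    """Return (segment, element) for an address such as "(2:$10)"."""
    match = MEMADDR.fullmatch(text)
    if match is None:
        raise ValueError(f"not a memory address: {text!r}")
    return (_to_int(match.group(1)), _to_int(match.group(2)))
```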

Labels are considered memory address atoms, as are names which appear to
the left of the define storage and define constant directives.

6.7.  Register  Addresses

Register  atoms  have  a  syntax  identical  to  that  of  memory  addresses  except 
that  a  lower  case  "r"  is  prepended  to  the  address. 

r(0:3) 

refers  to  register  segment  0,  register  3. 

Segment  and  element  addresses  start,  as  with  memory  addresses,  at  0.  The 
number  of  register  segments,  and  the  number  of  registers  within  each  segment, 
varies  as  determined  by  the  current  AM  configuration. 

6.8.  Stack  Addresses 

A  stack  address  has  only  one  component:  the  segment  address.  Stack 
addresses  are  specified  by  prepending  a  lower  case  "s"  to  an  unsigned  decimal  or 
hexadecimal  constant  enclosed  within  parentheses. 

s(2)

refers  to  stack  segment  2. 

Stack  addresses  begin  at  0.  The  number  of  stacks  depends  upon  AM's 
configuration. 
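
As a rough illustration of the three address syntaxes above (memory, register and stack), the following Python sketch classifies an address atom and extracts its components. The function name and the "mem"/"reg"/"stk" labels are hypothetical; the tolerance for blanks inside the parentheses is an assumption, since the manual does not say whether AMASM permits them.

```python
import re

_NUM = r"(\$[0-9a-fA-F]+|[0-9]+)"          # "$" hexadecimal or unsigned decimal
_MEM = re.compile(rf"^\(\s*{_NUM}\s*:\s*{_NUM}\s*\)$")
_REG = re.compile(rf"^r\(\s*{_NUM}\s*:\s*{_NUM}\s*\)$")
_STK = re.compile(rf"^s\(\s*{_NUM}\s*\)$")


def _value(token):
    """Convert one numeric component to an int."""
    return int(token[1:], 16) if token.startswith("$") else int(token)


def parse_address(text):
    """Return ("mem", seg, elem), ("reg", seg, elem) or ("stk", seg)."""
    s = text.strip()
    m = _MEM.match(s)
    if m:
        return ("mem", _value(m.group(1)), _value(m.group(2)))
    m = _REG.match(s)
    if m:
        return ("reg", _value(m.group(1)), _value(m.group(2)))
    m = _STK.match(s)
    if m:
        return ("stk", _value(m.group(1)))
    raise ValueError("syntax error")
```

Whether a given segment or element actually exists depends on the AM configuration and would be checked by the machine, not by a parser like this one.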

6.9.  File  Addresses 

File  address  atoms  may  not  appear  in  a  program  except  within  typed  values. 
File  address  atoms  are  represented  as  unsigned  integer  or  hexadecimal  constants. 

File  addresses  start  at  0.  The  number  of  files  which  may  be  open  at  one  time 
is  determined  by  the  current  AM  configuration.  The  first  three  file  addresses 
(0,1,2)  are  normally  opened  automatically  by  AM  when  a  program  is  loaded. 

7.  Typed  Values 

Some  of  the  atomic  types  may  also  appear  as  typed  values  in  certain 
instructions  and  directives.  A  typed  (immediate)  value  is  represented  as  an 
ordered  pair  consisting  of  a  keyword  representing  the  type,  and  the  atom  itself, 
separated  by  a  comma  and  enclosed  within  curly  braces  "{"  "}".

{int, 100}

represents  the  integer  value  100. 

{addr, (1:100)} 

represents  memory  address  value  (1:100).

A  list  of  the  types  which  may  be  used  as  immediate  values  alongside  the
corresponding  keywords  appears  below:

bool  -  boolean 
nat  -  natural 
int  -  integer

char  -  character 
string  -  character  string 
addr  -  memory  address 
file  -  file  address 
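
The typed value syntax can be sketched as a small Python parser that splits a braced value into its type keyword and its atom text. This is a simplified illustration under stated assumptions: the function names are hypothetical, the atoms are returned as uninterpreted strings, and more than one atom after the keyword is accepted here even though the manual allows that form only with the define storage and define constant directives.

```python
TYPE_KEYWORDS = {"bool", "nat", "int", "char", "string", "addr", "file"}


def split_top_level(body):
    """Split on commas that are not inside a quoted character or string."""
    parts, current, quote = [], [], None
    for ch in body:
        if quote:
            current.append(ch)
            if ch == quote:
                quote = None           # closing quote reached
        elif ch in "\"'":
            quote = ch                 # opening quote
            current.append(ch)
        elif ch == ",":
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_typed_value(text):
    """Return (keyword, [atom, ...]) for a value such as {int, 100}."""
    s = text.strip()
    if not (s.startswith("{") and s.endswith("}")):
        raise ValueError("syntax error")
    parts = split_top_level(s[1:-1])
    keyword, atoms = parts[0], parts[1:]
    if keyword not in TYPE_KEYWORDS or not atoms or "" in atoms:
        raise ValueError("syntax error")
    return keyword, atoms
```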

12.9.  Immediate  Atom

The  operand  is  an  atom. 

Syntax:  A 

A  -  usually  an  integer  or  natural

Format: 

[ tag ]  [ val ]


12.10.  Stack  Direct 

The  operand  is  a  stack. 

Syntax:  S 
Format: 

[ 000B ]  [ HHHHHHHH ]


13.  Instruction  Set 

The  AM  instruction  set  is  simple  but  powerful.  The  rigid  data  types  make  it 
meaningless  to  specify  operations  like  shift  and  mask,  thus  removing  some  of  the 
programmer’s  freedom  to  muck  with  data  in  arbitrary  ways.  The  tagged 
architecture  will  detect  errors  like  jumping  to  data,  or  accessing  instructions  as 
data,  as  well  as  the  more  common  bounds  checking  performed  by  runtime 
libraries. 

13.1.  Machine  Errors 

The  following  errors  are  detected  by  AM  during  loading  and  execution: 

-  attempt  to  execute  a  non-instruction 

-  attempt  to  execute  an  illegal  instruction 

-  memory  segment  not  defined 

-  memory  segment  overflow 

-  memory  segment  underflow 

-  register  segment  not  defined 

-  register  segment  overflow

-  register  segment  underflow 

-  stack  segment  not  defined 

-  <file>  contains  unresolved  references 

-  attempt  to  convert  negative  int  to  nat 

-  no  predecessor  to  zeronat 

-  unknown  operator  to  applybop 

-  unknown  operator  to  applymop 

-  unknown  operator  to  applydop 

-  unknown  operator  to  applyrelop 

-  type  error  -  GT 

-  type  error  -  GE 

-  type  error  -  LT 

-  type  error  -  LE 

-  no  more  segment  available

Syntax:  RN@I

R  -  holds  the  current  frame  pointer
N  -  a  non-negative  frame  reference
I  -  an  integer  frame  displacement

(R0@I  is  equivalent  to  R@I)

Format:

[ 0002 ]  [ HHHHHHHH ]  [ 0003 ]  [ HHHHHHHH ]

12.5.  Memory  Absolute 

Syntax:  M 

M  -  the  operand  address 

Format: 

[ 0009 ]

12.6.  Memory  Indirect

The  address  of  the  operand  is  in  a  memory  cell. 

Syntax:  M@

M  -  a  pointer  to  the  operand  address 
Format: 

12.7.  Program  Counter  Relative 

The  address  of  the  operand  is  the  sum  of  the  program  counter  and  an  integer 
displacement. 

Syntax:  M 

M  -  the  operand  address 

The  specified  address  must  be  in  the  same  module  as  the  instruction.  The 
assembler  automatically  computes  the  displacement.  Program  counter  relative  is 
specified  for  a  block  by  placing  a  rorg  directive  at  the  top  of  the  block. 
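
The displacement computation the assembler performs for program counter relative operands might look like the following Python sketch. Addresses are modelled as (segment, element) pairs. Two points are assumptions rather than statements from this manual: the displacement is measured from the address of the referencing instruction itself, and a reference into another segment is rejected with the assembler's "relative addressing not permitted between segments" error. The function names are hypothetical.

```python
def pcr_displacement(pc, target):
    """Displacement from pc to target; both are (segment, element) pairs."""
    pc_seg, pc_elem = pc
    target_seg, target_elem = target
    if pc_seg != target_seg:
        raise ValueError("relative addressing not permitted between segments")
    return target_elem - pc_elem


def pcr_effective_address(pc, displacement):
    """Effective address: the program counter plus an integer displacement."""
    seg, elem = pc
    return (seg, elem + displacement)
```

Because only the difference between the two addresses is emitted, the same code works wherever the block is loaded, which is what makes it relocatable.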

Format: 


12.8.  Immediate  Value 

The  operand  is  an  immediate  value. 

Syntax:  V 

V  -  any  typed  value 


Format: 


12.  Addressing  Modes 

AM  supports  10  addressing  modes: 

r  -  register  direct 

ri  -  register  indirect 

rid  -  register  indirect  with  displacement 

ridn  -  n-level  register  indirect  with  displacement 

m  -  memory  absolute 

mi  -  memory  indirect 

pcr  -  program  counter  relative

i  -  immediate  value 

a  -  immediate  atom 

s  -  stack  direct 

As  with  other,  more  familiar  processors,  not  all  AM  instructions  can  use  all  of  the
addressing  modes.

In  addition,  AMASM  supports  address  expressions,  which  provide  a
rudimentary  indexing  capability.

12.1.  Register  Direct 

The  operand  is  in  a  register. 

Syntax:  R 
Format: 


12.2.  Register  Indirect 

The  address  of  the  operand  is  in  a  register. 

Syntax:  R@ 

R  -  holds  the  operand  address 
Format: 


12.3.  Register  Indirect  with  Displacement 

The  address  of  the  operand  is  the  sum  of  the  address  in  a  register  and  an 
integer  displacement. 

Syntax:  R@I

R  -  holds  a  base  address 

I  -  an  integer  displacement 

Format: 


12.4.  N-level  Register  Indirect  with  Displacement 

The  address  of  the  operand  is  the  sum  of  the  address  obtained  from  the  nth 
link  in  a  chain  of  dynamic  links  and  an  integer  displacement. 


EXTERN 


External  Symbol 


EXTERN 


Syntax: 

extern  <name>... 
where: 

<name>  is  any  legal  identifier 

Description: 

The  symbols  in  the  list  are  made  visible  to  the  current  module  and  are  assumed  to
be  defined  elsewhere.  An  error  is  flagged  if  a  symbol  in  the  list  is  not  referenced
somewhere  within  the  current  module.  It  is  also  an  error  for  any  symbol  in  the 
list  to  be  defined  within  the  current  module. 
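
The two checks described for extern can be sketched in Python as below. The function and parameter names are hypothetical, and pairing each condition with a message from the assembler error list ("symbol defined locally", "symbols declared but not referenced") is an assumption.

```python
def check_externs(externs, referenced, defined):
    """Return error strings for a module's extern declarations.

    externs    -- names listed in extern directives
    referenced -- names used somewhere in the module
    defined    -- names defined within the module
    """
    errors = []
    for name in externs:
        if name in defined:
            # An external symbol must not also be defined locally.
            errors.append(f"{name}: symbol defined locally")
        elif name not in referenced:
            # An external symbol must be used at least once.
            errors.append(f"{name}: symbols declared but not referenced")
    return errors
```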

Example: 

extern  expon 

push  {int,100},s(0)
jsr  expon,s(0)


Format: 

For  each  symbol  declared  external,  an  extern  pseudo  op  is  emitted,  followed 
by  a  string  containing  the  symbol. 

[ 0190 ]  [ ... ]  [ 0005 ]  [ HH...00 ]


GLOBL


Global  Symbol 


GLOBL 


Syntax: 

globl  <name> ... 

where: 

<name>  is  any  legal  identifier 

Description: 

The  list  of  symbols  is  made  visible  to  external  modules.  Each  <name>  in 
the  list  must  be  defined  as  a  memory  address  somewhere  within  the  current 
module. 

Example: 

globl  test, data 

test: 

move  (0:0),r(0:0)
stop 

data  ds  10 

"test"  and  "data"  are  made  visible  to  other  modules. 

Format: 

For  each  symbol  declared  global,  a  globl  pseudo  op  is  emitted,  followed  by  a 
string  containing  the  symbol,  followed  by  a  memory  address  representing  the 
value  of  the  symbol. 


DC 


Define  Constant 


DC 


Syntax: 

[<name>]  dc  V... 
where: 

<name>  is  an  optional  identifier 

dc  permits  a  list  of  atoms  to  follow  the  type  keyword  of  each  value. 

Description: 

dc  allocates  and  initializes  storage  from  a  list  of  values  starting  at  the 
current  value  of  the  location  counter. 

Example: 

data3  dc  {char, 'a', 'b'}

dc  {string, "this  is  a  string  value"} 

The  first  dc  shown  allocates  2  character  values.

The  second  allocates  a  single  string  value.  No  identifier  was  specified.

Format: 

A  typed  value  is  emitted  for  each  value  in  the  list. 


DS 


Define  Storage 


DS 


Syntax: 

[<name>]  ds  N  [V...] 

[<name>]  ds  [N]  V... 

where: 

<name>  is  an  optional  identifier 

ds  permits  a  list  of  atoms  to  follow  the  type  keyword  of  each  value. 

Description: 

ds  allocates  storage  for  values  starting  at  the  current  value  of  the  location 
counter. 

-  If  N  is  specified  and  N  is  greater  than  or  equal  to  the  number  of  values  in 
the  list,  space  for  N  values  is  allocated  and  the  location  counter  is 
incremented  by  N. 

-  If  N  is  specified  and  N  is  less  than  the  number  of  values  in  the  list,  N  is
ignored. 

-  If  N  is  not  specified,  the  amount  of  storage  allocated  is  equal  to  the 
number  of  values  in  the  list.  The  location  counter  is  incremented  by  this 
number. 

-  If  a  value  list  is  specified,  the  allocated  cells  will  be  initialized  to  those 
values,  beginning  with  the  first. 

-  Cells  allocated  but  not  initialized  are  considered  to  hold  undefined  values. 
It  is  an  error  to  attempt  to  read  an  undefined  value. 
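
The allocation rules for ds listed above can be condensed into a short Python sketch. The names ds_allocate and UNDEF are hypothetical, and the values are modelled as plain Python objects rather than AM typed values.

```python
UNDEF = object()    # marker for an allocated but uninitialized cell


def ds_allocate(n=None, values=()):
    """Return the list of cells a ds directive would allocate.

    n      -- the optional count N
    values -- the optional list of initial values
    """
    values = list(values)
    if n is None and not values:
        raise ValueError("syntax error")     # ds needs N, a value list, or both
    if n is None or n < len(values):
        count = len(values)                  # N missing or too small: N is ignored
    else:
        count = n
    # Initialized cells come first; the remainder hold undefined values.
    return values + [UNDEF] * (count - len(values))
```

The location counter would then advance by the length of the returned list.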

Example: 


data1  ds  10
data2  ds  10  {int,100},{nat,0,20,40}
data3  ds  {char, 'a', 'b'}
       ds  {string, "this  is  a  string  value"}

The  first  ds  allocates  10  values  and  leaves  them  undefined.  "data1"  may  be
used  to  index  into  those  values.

The  second  also  allocates  10  values,  but  initializes  the  first  to  the  integer
100,  and  the  next  3  to  the  naturals  0,  20,  and  40.  The  last  6  values  are  left
undefined.

The  third  ds  shown  allocates  2  character  values. 

The  fourth  allocates  a  single  string  value.  No  identifier  was  specified.

Format: 

A  typed  value  is  emitted  for  each  value  in  the  list.  In  addition,  ds  will  emit 
an  org  pseudo  op  (see  org)  whenever  the  number  of  values  in  the  value  list  is
less  than  N. 

RORG


Relative  Origin 


RORG 


Syntax: 
rorg  [M] 

Description: 

The  location  counter  is  reset  to  M,  if  specified;  otherwise  it  remains 
unchanged.  All  memory  addresses  and  labels  specified  after  a  rorg  directive  up 
to  the  next  org  or  rorg  directive  are  computed  as  displacements.  Code 
generated  after  a  rorg  directive  up  to  the  next  org  or  rorg  directive  is 
relocatable  (program  counter  independent). 

Example: 

rorg 

move  {int, 100}, data 

jsr  stuff 

stop 

data  ds  10 

In  the  above  example,  the  move  would  be  emitted  using  destination 
program  counter  relative  addressing. 


Format: 


ORG 


Absolute  Origin 


ORG 


Syntax: 

org  [M]

Description: 

The  location  counter  is  reset  to  M,  if  specified;  otherwise  it  remains 
unchanged.  All  memory  addresses  and  labels  specified  after  an  org  directive  up 
to  the  next  org  or  rorg  directive  not  explicitly  expressed  as  displacements  are
treated  as  absolute  addresses.  Code  generated  after  an  org  directive  up  to  the 
next  org  or  rorg  directive  is  not  relocatable. 

Example: 

org 

move  (0:0),r(0:0) 
org  (1:0) 

data  ds  {int,100},{nat,0} 

Format: 


EQU 


Equate 


EQU 


Syntax: 

<name>  equ  <equivalence> 
where: 

<name>  is  any  legal  identifier 
<equivalence>  is  any  atom  or  typed  value

Description: 

The  symbol  <name>  is  assigned  the  value  of  <equivalence>.  Elsewhere  in 
the  source  module,  the  symbol  may  be  used  in  place  of  a  literal  value  of  the  same 
type  as  <equivalence>  using  the  following  syntax: 

-  If  the  symbol  represents  a  memory  address  atom,  the  symbol  may  be  used 
directly. 

-  If  the  symbol  represents  a  typed  (immediate)  value,  it  must  be  enclosed  in
curly  braces  "{"  "}".

-  If  the  symbol  represents  an  integer  or  natural  atom,  it  must  be  preceded  by
a  pound  sign  "#".


Example: 

progseg   equ   (0:0)
dataseg   equ   (1:100)
offset    equ   10
datafile  equ   {file, 3}

          org   progseg
          move  {addr, data},r(0:0)
          move  {int, 100},r(0:0)@#offset
          push  {string, "test.dat"},s(0)
          push  {datafile},s(0)
          push  {int,0},s(0)
          push  {int,0},s(0)
          open  s(0)
          stop

          org   dataseg
data      ds    100

"progseg"  and  "dataseg"  are  equated  to  memory  address  atoms.

"offset"  is  equated  to  the  integer  atom  10.

"datafile"  is  equated  to  the  file  address  value  {file, 3}. 


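[The  remaining  value  format  diagrams  are  illegible  in  the  source.  Legible  fragments:  the  labels  "stack  address"  and  "instruction";  the  tag  fields  0170,  0180,  0190  and  01A0;  and  the  note  that  an  instruction  is  followed  by  zero  or  more  operand  atoms.]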


10.3.  Object  Module  Format 

The  structure  of  an  object  module  is  very  simple.  The  only  object  always 
found  is  a  leading  org  directive.  Next,  if  any  symbols  were  declared  global  or 
external  in  the  source  module,  a  pseudo  instruction  will  be  emitted  for  each  such 
symbol.  The  rest  of  the  file  contains  executable  and  pseudo  instructions  emitted 
as  they  occur  in  the  source. 


11.  Assembler  Directives 

AMASM  recognizes  the  following  directives: 

equ  -  equate 
org  -  absolute  origin 
rorg  -  relative  origin 
extern  -  external  symbol 
globl  -  global  symbol 
ds  -  define  storage 
dc  -  define  constant 

Directives  do  not  produce  code  which  will  be  executed  by  AM,  but  they  may 
cause  linker/loader  instructions  to  be  emitted.  The  meaning  and  syntax  of  each 
directive  is  described  in  the  following  pages. 


AM’s  tagged  architecture.  The  following  conventions  will  apply: 

-  All  numbers  shown  are  in  hexadecimal.

-  The  letter  "H"  is  a  place  holder  signifying  any  4-bit  value. 

-  The  general  form  of  a  typed  value  is

   [ tag ]  [ val ]

where  "tag"  is  a  16-bit  type  field,  and  "val"  is  an  8  to  32-bit  value.


There  are  two  exceptions: 

-  Character  string  atoms  and  values  have  a  16-bit  size  field  inserted  after  the 
type  field  which  indicates  the  number  of  characters  in  the  value  field 
(including  the  terminating  null).  This  size  field  is  omitted  in  memory  (since 
it  is  not  needed),  replaced  by  a  pointer  to  the  string.  Both  the  size  field  and 
pointer  will  be  omitted  in  the  format  diagrams. 

-  Instruction  values  have  a  16-bit  opcode  following  the  type  field,  followed 
by  a  list  of  operand  values. 

A  number  of  the  formats  listed  below  are  not  described  elsewhere  in  this 
manual  since  they  are  either  not  accessible  to  the  programmer,  or  are  implied  by 
context. 

10.1.  Atom  Formats

boolean  -  [ 0001 ]  [ HH ]
natural  -  [ 0002 ]  [ HHHHHHHH ]
integer  -  [ 0003 ]  [ HHHHHHHH ]
character  -  [ 0004 ]  [ HH ]
character  string  -  [ 0005 ]  [ HH...00 ]
memory  address  -  [ 0009 ]  [ HHHHHHHH ]
register  address  -  [ 000A ]  [ HHHHHHHH ]
stack  address  -  [ 000B ]  [ HHHHHHHH ]
file  address  -  [ 0011 ]  [ HHHH ]
monadic  operator  -  [ 000C ]  [ HHHH ]
dyadic  operator  -  [ 000D ]  [ HHHH ]
relational  operator  -  [ 000E ]  [ HHHH ]
boolean  comparator  -  [ 0012 ]  [ HHHH ]


10.2.  Value  Formats 


boolean  -  [ 0110 ]  [ HH ]
natural  -  [ 0120 ]  [ HHHHHHHH ]
integer  -  [ 0130 ]  [ HHHHHHHH ]
character  -  [ 0140 ]  [ HH ]
character  string  -  [ 0150 ]  [ HH...00 ]


Immediate  values  are  used,  as  in  conventional  assembly  languages,  for  loading 
constants  into  cells,  initializing  storage,  pushing  parameters  to  subroutines  on  the 
stack,  and  so  on. 

A  special  syntax  may  be  applied  when  expressing  typed  values  for  the  define 
storage  and  define  constant  directives.  The  type  keyword  may  be  followed  by 
a  list  of  atoms  of  the  appropriate  type,  separated  by  commas. 

{int, 1, 2, 3, 4, 5, 6, 7, 8}

shows  an  example  of  this.

8.  Expressions 

An  expression  may  be  substituted  anywhere  an  integer  or  natural  atom  is 
called  for.  The  expression  must  be  a  sequence  of  integer/natural  atoms  (and 
symbolic  constants  equated  to  integer/natural  atoms)  separated  by  operators  and 
grouping  symbols  which  evaluates  to  an  atom  of  the  type  called  for  where  the 
expression  is  used. 

8.1.  Expression  Operators 

Legal  operators  are  (in  order  of  increasing  precedence): 

|      -  or
&      -  and
+  -   -  addition  and  subtraction
*  /  %  -  multiplication,  division,  and  modulus
-      -  unary  minus

Expressions  may  be  grouped  using  parentheses  "("  ")". 
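
To make the precedence table concrete, here is a small recursive-descent evaluator in Python. It is a sketch under several assumptions that the manual does not settle: "|" and "&" are treated as bitwise operators on integers, "/" truncates toward zero, all binary operators associate to the left, and a symbolic constant is written with a leading pound sign and looked up in a caller-supplied table. All names in the code are hypothetical.

```python
import re

# Binary operators grouped by precedence level, lowest first.
LEVELS = [("|",), ("&",), ("+", "-"), ("*", "/", "%")]
_TOKEN = re.compile(r"\s*(\$[0-9a-fA-F]+|\d+|#\w+|[|&+\-*/%()])")


def _div(a, b):
    """Integer division that discards the remainder (truncates toward zero)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def _apply(op, a, b):
    if op == "|":
        return a | b
    if op == "&":
        return a & b
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        return _div(a, b)
    return a - b * _div(a, b)              # "%"


def evaluate(text, symbols=None):
    """Evaluate an expression; symbols maps equated names to integers."""
    symbols = symbols or {}
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError("syntax error")
        tokens.append(m.group(1))
        pos = m.end()
    index = 0

    def peek():
        return tokens[index] if index < len(tokens) else None

    def take():
        nonlocal index
        tok = peek()
        if tok is None:
            raise ValueError("syntax error")
        index += 1
        return tok

    def binary(level):
        if level == len(LEVELS):
            return unary()
        left = binary(level + 1)
        while peek() in LEVELS[level]:
            op = take()
            left = _apply(op, left, binary(level + 1))
        return left

    def unary():
        if peek() == "-":
            take()
            return -unary()                # unary minus binds tightest
        return primary()

    def primary():
        tok = take()
        if tok == "(":
            value = binary(0)
            if take() != ")":
                raise ValueError("syntax error")
            return value
        if tok.startswith("$"):
            return int(tok[1:], 16)
        if tok.startswith("#"):
            return symbols[tok[1:]]
        if tok.isdigit():
            return int(tok)
        raise ValueError("syntax error")

    result = binary(0)
    if peek() is not None:
        raise ValueError("syntax error")
    return result
```

With this sketch, evaluate("2+3*4") groups the multiplication first, and evaluate("#offset+2", {"offset": 10}) resolves the symbol before adding.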

9.  Notation 

Throughout  the  rest  of  this  manual,  the  following  notational  conventions  will 
be  used  to  describe  the  syntax  of  directives  and  instructions. 

M  -  memory  address  atom 

R  -  register  address  atom 

S  -  stack  address  atom 

I  -  integer  atom 

N  -  natural  atom 

A  -  atom 

V  -  typed  value 

<  >  -  items  enclosed  within  angle  brackets  are  arguments 

[  ]  -  items  enclosed  in  square  brackets  are  optional 

<ea>  -  effective  address 
<ev>  -  effective  value 

10.  Data  Format 

AMASM  emits  object  code  and  directives  using  AM  I/O  modules.  The 
object  module  is,  thus,  directly  readable  by  AM.  A  linker  and  loader  may  be 
written  either  in  a  high  level  language,  or  AM  assembler. 

The  data  and  object  module  formats  described  below  are  a  direct  reflection  of 


-  attempt  to  free  invalid  memory  segment 

-  attempt  to  free  non-allocated  segment 

-  stack  empty 

-  stack  overflow 

-  stack  underflow 

-  file  already  open 

-  unable  to  close  file 

-  unable  to  open  <file> 

-  file  already  closed 

-  file  not  open 

-  file  not  open  for  reading 

-  file  not  open  for  writing 

-  reading  file,  type  not  recognized 

-  error  reading  file 

-  writing  file,  type  not  recognized 

-  invalid  memory  segment 

-  memory  segment  not  allocated 

-  invalid  memory  address 

-  invalid  register  segment 

-  invalid  register  address 

-  invalid  stack  segment 

-  invalid  file  descriptor 

-  attempt  to  return  head  of  null  string 

-  value  not  of  type  bool 

-  atom  not  of  type  bool 

-  value  not  of  type  int 

-  atom  not  of  type  int 

-  value  not  of  type  nat 

-  atom  not  of  type  nat 

-  value  not  of  type  char 

-  atom  not  of  type  char 

-  value  not  of  type  string 

-  atom  not  of  type  string 

-  value  not  of  type  memaddr 

-  atom  not  of  type  memaddr 

-  value  not  of  type  regaddr 

-  atom  not  of  type  regaddr 

-  value  not  of  type  stkaddr 

-  atom  not  of  type  stkaddr 

-  value  not  of  type  instr 

-  atom  not  of  type  instr 

-  value  not  of  type  file 

-  atom  not  of  type  file 

-  type  error 

All  machine  errors  are  fatal. 


13.2.  Assembler  Errors

AMASM  will  detect  and  report  the  following  errors: 

-  symbol  not  an  address 

-  symbol  defined  locally 

-  <symbol>  does  not  match  declared  type 

-  relative  memory  indirect  not  permitted 

-  symbol  not  a  value 

-  symbol  not  an  integer 

-  symbols  declared  but  not  referenced 

-  displacement  from  external  addresses  not  permitted 

-  relative  addressing  not  permitted  between  segments 

-  out  of  symbol  space 

-  symbol  declared  external 

-  symbol  already  defined 

-  symbol  not  of  same  type 

-  impossible  value  for  given  type 

-  syntax  error 

Assembler  errors  are  not  fatal,  but  will  prevent  the  creation  of  the  object 
module  and,  usually,  the  cross-reference  file. 

13.3.  AM  Operations 

AM  supports  a  useful  set  of  monadic,  dyadic,  relational  and  test  operators. 
These  operators  are  to  be  used  with  the  monad,  dyad,  if  and  test  instructions.
The  mnemonics/symbols  for  each  operator  along  with  the  data  types  to  which 
each  may  be  applied  are  described  below. 

13.3.1.  Dyadic  Operators  (DOP's)

cat  -  string  concatenation

cat  accepts  two  string  arguments  and  returns  the  concatenation  of  the  first 
onto  the  second. 

add,sub,mul,div  -  computational  operators 

These  operators  accept  integer  or  natural  arguments  (both  of  the  same  type) 
and  return  a  result  of  that  type.  Divide  by  zero  returns  an  error.  div
discards  any  remainder.

and, or 

and  and  or  accept  two  boolean  arguments  and  return  a  boolean  result. 
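
The type rules for the dyadic operators can be illustrated with a Python sketch in which a typed value is a (type, value) pair. Several details are assumptions: the operand order follows the "Ry <dop> Rx" convention of the dyadic instructions, cat is taken to produce the second string followed by the first, div truncates toward zero, and a negative natural result is treated as an error. The function names are hypothetical.

```python
def _div(a, b):
    """Integer division that discards the remainder."""
    if b == 0:
        raise ZeroDivisionError("divide by zero")
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def apply_dop(op, x, y):
    """Return the typed result of 'y <dop> x'; x and y are (type, value) pairs."""
    (tx, vx), (ty, vy) = x, y
    if op == "cat":
        if (tx, ty) != ("string", "string"):
            raise TypeError("type error")
        return ("string", vy + vx)
    if op in ("add", "sub", "mul", "div"):
        # Both arguments must be int, or both must be nat.
        if tx != ty or tx not in ("int", "nat"):
            raise TypeError("type error")
        if op == "add":
            result = vy + vx
        elif op == "sub":
            result = vy - vx
        elif op == "mul":
            result = vy * vx
        else:
            result = _div(vy, vx)
        if tx == "nat" and result < 0:
            raise ArithmeticError("natural result would be negative")
        return (tx, result)
    if op in ("and", "or"):
        if (tx, ty) != ("bool", "bool"):
            raise TypeError("type error")
        return ("bool", (vy and vx) if op == "and" else (vy or vx))
    raise ValueError("unknown operator to applydop")
```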

13.3.2.  Monadic  Operators  (MOP's)

len  -  string  length

len  accepts  a  string  and  returns  its  length  as  a  natural  number. 


not  -  boolean  negation 

not  accepts  a  boolean  argument  and  returns  its  negation. 

make  -  make  a  string 

This  operator  accepts  a  character  argument  and  returns  a  string  of  length  1. 

head  -  the  head  of  a  string 

This  operator  accepts  a  string  and  returns  the  character  at  its  head.  It  is  an 
error  to  take  the  head  of  an  empty  string. 

tail  -  the  rest  of  a  string 

tail  accepts  a  string  and  returns  a  string  containing  all  but  the  first 
character.  The  tail  of  an  empty  string  is  the  empty  string. 
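
The five monadic operators are simple enough to sketch directly in Python, again modelling a typed value as a (type, value) pair. The function names are hypothetical; the error strings are taken from the machine error list, on the assumption that they correspond to these cases.

```python
def _expect(value, wanted):
    """Check the type of a (type, value) pair and return the raw value."""
    kind, raw = value
    if kind != wanted:
        raise TypeError("value not of type " + wanted)
    return raw


def apply_mop(op, x):
    """Apply a monadic operator to the typed value x."""
    if op == "len":
        return ("nat", len(_expect(x, "string")))
    if op == "not":
        return ("bool", not _expect(x, "bool"))
    if op == "make":
        return ("string", _expect(x, "char"))      # one-character string
    if op == "head":
        s = _expect(x, "string")
        if s == "":
            raise ValueError("attempt to return head of null string")
        return ("char", s[0])
    if op == "tail":
        return ("string", _expect(x, "string")[1:])  # "" stays ""
    raise ValueError("unknown operator to applymop")
```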

13.3.3.  Relational  Operators  (RELOP's)

The  relational  operators  are: 

==  -  equality

>  -  greater  than 

>=  -  greater  than  or  equal  to 

<  -  less  than 

<=  -  less  than  or  equal  to 

!=  -  not  equal  to 

They  may  be  applied  to  int,  nat,  char  and  string. 

If  ==  or  !=  are  applied  to  arguments  of  different  types,  ==  returns  true,  !=
returns  false.  This  applies  also  to  types  not  listed  above.  >,  >=,  <  and  <=  return
an  error  if  their  arguments  are  not  of  the  same  type.

Relational  operators  return  a  boolean  result. 

13.3.4.  Test  Operators  (BOP's) 

These  operators  permit  the  programmer  to  test  a  cell  for  type  before 
attempting  to  access  it.  These  are  necessary  because  AM  considers  it  a  fatal 
error  to  read  from  an  undefined  cell  or  apply  an  operator  of  one  type  on  data  of 
another.  The  test  operators  are  the  same  as  the  type  mnemonics,  plus  a 
mnemonic  for  testing  undefined  values: 

bool 

nat 

int 

char 

string 

instr 

addr 

file 

undef 

Test  operators  accept  a  typed  value  and  return  true  if  the  value  is  of  the  specified
type,  false  otherwise.  undef  returns  true  if  a  value  is  undefined,  false  otherwise.
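
A Python sketch of the test operators follows. A cell is modelled either as a (type, value) pair or as None for an undefined cell; that representation, the function name, and the choice to return false when a type test is applied to an undefined cell are all assumptions made for illustration.

```python
UNDEFINED = None
TYPE_MNEMONICS = ("bool", "nat", "int", "char", "string", "instr", "addr", "file")


def apply_bop(op, cell):
    """Return True if the cell passes the test named by op."""
    if op == "undef":
        return cell is UNDEFINED
    if op not in TYPE_MNEMONICS:
        raise ValueError("unknown operator to applybop")
    # A type test on an undefined cell is assumed to be false, not an error.
    return cell is not UNDEFINED and cell[0] == op
```

A program would typically apply undef or a type test first and only then read the cell, which avoids the fatal errors described above.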


DYADS 


Dyadic  Short 


DYADS 


Syntax: 

<dop>  Rx,Ry

where: 

<dop>  is  a  dyadic  operator 

Operation: 

Ry  <dop>  Rx  -->  Ry

Description: 

The  operation  corresponding  to  <dop>  is  applied  to  the  operands  and  the 
result  stored  in  Ry. 

Example: 

and  r(0:0),r(0:1)

Addressing  Modes: 

Rx:  r 
Ry:  r 

Format: 


[ 0190 ]  [ 4801 ]  [ operands ]


DYADSI 


Dyadic  Short  Immediate 


DYADSI 


Syntax: 

<dop>  V,R 

where: 

<dop>  is  a  dyadic  operator 

Operation: 

R  <dop>  V  -->  R

Description: 

The  operation  corresponding  to  <dop>  is  applied  to  the  operands  and  the 
result  stored  in  R. 

Example: 

sub  {int,100},r(0:1)

Addressing  Modes: 

V:  i 
R:  r 

Format: 

[ 0190 ]  [ 4802 ]  [ operands ]


DYAD


Dyadic  Long 


DYAD 


Syntax: 

<dop>  Rx,Ry,Rz 
where: 

<dop>  is  a  dyadic  operator 

Operation: 

Ry  <dop>  Rx  -->  Rz 

Description: 

The  operation  corresponding  to  <dop>  is  applied  to  Rx  and  Ry  and  the 
result  stored  in  Rz. 

Example: 

add  r(0:0),r(0:1),r(0:3)

<dop>  Rx,Ry,Ry  is  equivalent  to  <dop>  Rx,Ry 

Addressing  Modes: 

Rx:  r 
Ry:  r 
Rz:  r 

Format: 

[ 0190 ]  [ 4803 ]  [ operands ]


DYADI 


Dyadic  Long  Immediate 


DYADI 


Syntax: 

<dop>  V,Rx,Ry

where: 

<dop>  is  a  dyadic  operator 

Operation: 

Rx  <dop>  V  -->  Ry 

Description: 

The  operation  corresponding  to  <dop>  is  applied  to  V  and  Rx  and  the  result 
stored  in  Ry. 

Example: 

add  {int,100},r(0:0),r(0:1)

<dop>  V,Rx,Rx  is  equivalent  to  <dop>  V,Rx 

Addressing  Modes: 

V:  i 
Rx:  r 
Ry:  r 

Format: 

[ 0190 ]  [ 4804 ]  [ operands ]


MONADS 


Monadic  Short 


MONADS 


Syntax: 

<mop>  R 

where: 

<mop>  is  a  monadic  operator 

Operation: 

<mop>  R  -->  R 

Description: 

The  operator  corresponding  to  <mop>  is  applied  to  R  and  the  result  stored 
in  R. 

Example: 

not  r(0:0)

Addressing  Modes: 

R:  r 

Format: 


MONAD


Monadic  Long 


MONAD 


Syntax: 

<mop>  Rx,Ry

where: 

<mop>  is  a  monadic  operator 

Operation: 

<mop>  Rx  -->  Ry 

Description: 

The  operator  corresponding  to  <mop>  is  applied  to  Rx  and  the  result  stored 
in  Ry. 

Example: 

not  r(0:0),r(1:0)

Addressing  Modes: 

Rx:  r 
Ry:  r 


Format: 


MONADI


Monadic  Long  Immediate 


MONADI


Syntax: 

<mop>  V,R

where: 

<mop>  is  a  monadic  operator 


Operation: 

<mop>  V  -->  R


Description: 

The  operator  corresponding  to  <mop>  is  applied  to  the  immediate  value  V 
and  the  result  stored  in  R. 

Example: 

not  {bool,flag},r(1:0)


Addressing  Modes: 
V:  i 
R:  r 


Format: 


[ 0190 ]  [ 4809 ]  [ operands ]


OFFSET  Offset  an  Address  OFFSET

Syntax: 

offset  I,R

R  must  contain  a  memory  address  atom 

Operation: 

R  +  I  ->  R 
Description: 

The  sum  of  I  and  the  address  in  R  is  stored  in  R. 

Example: 

offset  20,r(0:0) 

Addressing  Modes: 

I:  a 
R:  r 

Format: 


MOVE


Move  a  Value 


MOVE 


Syntax: 

move  <ea1>,<ea2>
where: 

<ea>  must  be  one  of  the  addressing  modes  listed  below 

Operation: 

source  -->  dest

Description: 

The  value  found  at  the  source  address  is  copied  into  the  destination  address. 

Example: 

move  r(0:0),data 
move  {addr,data},r(0:20) 
move  {int,100},r(0:20)@
move  r(0:20)@10,r(0:10) 

data:  ds  100 


Addressing  Modes: 

<ea1>:  r,ri,rid,ridn,m,pcr,i
<ea2>:  r,ri,rid,ridn,m,pcr


Format: 

[ 0190 ]  { [ H815 ]...[ H83C ] }  [ operands ]


PUSH 


Push  a  Value 


PUSH 


Syntax: 

push  <ea>,S

where: 

<ea>  is  one  of  the  addressing  modes  listed  below 

Operation: 

source  -->  S

Description: 

The  source  value  is  pushed  onto  stack  S.  The  programmer  has  no  access  to 
the  stack  pointer. 

Example: 

push  {int,100},s(0) 
push  r(0:10),s(1)

Addressing  Modes: 

<ea>:  m,pcr,r,ri,rid,ridn,i
S:  s 

Format: 

[ 0190 ]  { [ H83D ]...[ H843 ] }  [ operands ]


UNLINK


Unlink  and  Free 


UNLINK 


Syntax:

unlink  R 

Operation: 

R@  -->  R

Description:

The  value  in  the  base  address  of  the  segment  pointed  to  by  R  is  returned  in
R.  The  segment  is  freed.

Example: 

proc:  link  r(0:5),1
       move  r(0:5)2@4,r(0:0)
       add  {int,100},r(0:0)
       move  r(0:0),r(0:5)2@4
       unlink  r(0:5)
       rts

Addressing  Modes: 

R:  r 

Format: 

[ 0190 ]  [ 2897 ]  [ operand ]


LINK 


Link  Frame  and  Allocate 


LINK 


Syntax:

link  R,N

Operation:

R  -->  address@

address  -->  R

Description:

A  segment  of  N  cells  is  allocated  from  the  heap.  The  value  stored  in  R  is
saved  at  the  base  address  of  the  segment.  The  segment  base  address  is  returned  in
R.

This  instruction  is  designed  to  create  dynamic  links  for  local  environments. 

Example: 

proc:  link  r(0:5),1
       move  r(0:5)2@4,r(0:0)
       add  {int,100},r(0:0)
       move  r(0:0),r(0:5)2@4
       unlink  r(0:5)
       rts

Above  is  an  example  of  uplevel  addressing. 
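
The interplay between link, unlink and n-level register indirect addressing can be sketched in Python as follows. This is a toy model, not AM's implementation: the class and method names are hypothetical, addresses are (segment, element) pairs, each link allocates a fresh segment number, and registers are held in a plain dictionary.

```python
class FrameHeap:
    """Toy heap of segments; addresses are (segment, element) pairs."""

    def __init__(self):
        self.segments = {}          # segment number -> list of cells
        self._next = 0

    def read(self, addr):
        seg, elem = addr
        return self.segments[seg][elem]

    def link(self, regs, r, n):
        """link R,N: allocate N cells, save R at the base, point R at the base."""
        if n < 1:
            raise ValueError("a frame needs at least one cell for the link")
        seg = self._next
        self._next += 1
        self.segments[seg] = [regs[r]] + [None] * (n - 1)
        regs[r] = (seg, 0)

    def unlink(self, regs, r):
        """unlink R: reload R from the base of its segment, then free it."""
        seg, _ = regs[r]
        regs[r] = self.segments[seg][0]
        del self.segments[seg]

    def effective_address(self, regs, r, n, disp):
        """Follow n dynamic links starting at R, then add the displacement."""
        addr = regs[r]
        for _ in range(n):
            addr = self.read(addr)  # each frame's base cell holds the previous R
        seg, elem = addr
        return (seg, elem + disp)
```

After three nested link operations on the same register, following two links from the innermost frame reaches the outermost one, which is the uplevel access the example performs with a frame reference of 2 and a displacement of 4.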

Addressing  Modes: 

R:  r 

N:  a 

Format: 

[ 0190 ]  [ 3896 ]  [ operands ]


RTS  Return  from  Subroutine  RTS

Syntax:

rts  S

Operation:

S  -->  PC

Description:

Execution  resumes  at  the  address  popped  from  stack  S.

Example:

incr:  add  {int,1},r(0:0)
       rts  s(0)

Addressing  Modes:

S:  s

Format:

[ 0190 ]  [ 2895 ]  [ operand ]

BSR


Branch  Subroutine 


BSR 


Syntax: 

bsr  <ev>,S

where: 

<ev>  is  one  of  the  addressing  modes  listed  below 

Operation: 

PC  -->  S

PC  +  <ev>  -->  PC

Description:

The  program  counter  is  pushed  onto  stack  S,  and  execution  resumes  at  the
sum  of  the  program  counter  and  <ev>.

Example: 

bsr  r(1:0),s(0)

Addressing  Modes: 

<ev>:  r,a
S:  s

Format:

[ 0190 ]  {  ...  }  [ operands ]


JSR


Jump  Subroutine 


JSR 


Syntax: 

jsr  <ea>,S 

where: 

<ea>  is  one  of  the  addressing  modes  listed  below 

Operation: 

PC  -->  S
<ea>  -->  PC

Description: 

The  program  counter  is  pushed  onto  stack  S,  and  execution  resumes  at 
<ea>. 

Following  a  rorg  directive,  memory  absolute  is  converted  automatically  to 
program  counter  relative. 
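
The difference between jsr, bsr and rts can be summarized in a few lines of Python. In this sketch the program counter is simplified to a single integer, a Python list stands in for the AM stack, and the pushed value is assumed to be the address at which execution should resume after the return. The function names are hypothetical.

```python
def jsr(pc, target, stack):
    """Jump to subroutine: push the program counter, continue at target."""
    stack.append(pc)
    return target


def bsr(pc, displacement, stack):
    """Branch to subroutine: push the program counter, continue at pc + displacement."""
    stack.append(pc)
    return pc + displacement


def rts(stack):
    """Return from subroutine: continue at the address popped from the stack."""
    if not stack:
        raise RuntimeError("stack empty")
    return stack.pop()
```

Since the return address lives on an ordinary AM stack, the caller and the subroutine must name the same stack segment in jsr (or bsr) and rts.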

Example: 

jsr  incr,s(0) 

Addressing  Modes: 

<ea>:  m,mi,r,pcr
S:  s


Format: 


STOP  Halt  Execution  STOP 

Syntax: 

stop 

Operation: 


Description: 

Execution  is  terminated. 

Addressing  Modes: 


Format: 


IFTE 


If-Then-Else:  Conditional  Jump/Branch 


Syntax: 

if  R  <relop>  <ev>,Mx,My 
if  <bop>  <ea>,Mx,My 

where: 

<relop>  is  a  relational  operator
<bop>  is  a  test  operator 

<ea>  and  <ev>  are  one  of  the  addressing  modes  listed  below 

Operation: 

if  R  <relop>  <ev>  then
    Mx  -->  PC
else
    My  -->  PC

if  <bop>  <ea>  then
    Mx  -->  PC
else
    My  -->  PC


Description: 

If  the  comparison  is  true,  execution  resumes  at  Mx;  otherwise,  at  My. 


Example: 


        if    r(0:0) > r(0:1),case1,case2
stuff:  move  r(0:0),data
case1:  jsr   first,s(0)
        if    int r(0:0),case1
        stop
case2:  jsr   second,s(0)
        stop

Addressing  Modes: 
R:  r 

<ev>:  r,i 
<ea>:  r,m 
Mx:  m,pcr 
My:  m,pcr 


IF  If:  Conditional  Jump/Branch  IF

Syntax: 

if  R  <relop>  <ev>,M
if  <bop>  <ea>,M 

where: 

<relop>  is  a  relational  operator 
<bop>  is  a  test  operator 

<ea>  and  <ev>  are  one  of  the  addressing  modes  listed  below 

Operation: 

if  R  <relop>  <ev>  then
    M  -->  PC

if  <bop>  <ea>  then
    M  -->  PC

Description:

If  the  comparison  is  true,  execution  resumes  at  M;  otherwise,  with  the  next
instruction.

Example: 

        move  {int,10},r(0:0)
loop:   if    r(0:0) < {int,1},done
        sub   {int,1},r(0:0)
        jmp   loop
done:   if    int data,loop
data    ds    1

Addressing  Modes: 
R:  r 

<ev>:  r,i
<ea>:  r,m 
M:  m,pcr 

Format: 


[ 586C ]

[ 4874 ]


BRA 


Branch 


BRA 


Syntax: 

bra  <ev> 

where: 

<ev>  is  one  of  the  addressing  modes  listed  below 

Operation: 

PC  +  <ev>  -->  PC

Description: 

Execution  resumes  at  the  sum  of  the  program  counter  and  the  effective  value. 


Example: 


bra  100 


Addressing  Modes: 

<ev>:  a,r 


Format: 


JMP 


Jump 


JMP 


Syntax: 

jmp  <ea> 

where: 

<ea>  is  one  of  the  addressing  modes  listed  below 

Operation: 

<ea>  -->  PC

Description: 

Execution  resumes  at  <ea>. 

If  jmp  follows  a  rorg  directive,  a  jump  to  memory  absolute  is  converted  to  a 
branch. 

Example: 

jmp  here 
jmp  r(0:0) 

here:  jmp  (1:150)@

Addressing  Modes: 

<ea>:  m,r,mi,pcr 


Format: 


POPX 


Remove  the  Top  of  a  Stack 


Syntax: 
popx  S 

Operation: 

S  -->

Description: 

The  top  value  of  stack  S  is  removed. 

It  is  an  error  to  attempt  to  remove  the  top  of  an  empty  stack. 
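
The behaviour of push, pop and popx on a single stack segment can be sketched in Python as below. The class name and the fixed capacity parameter are hypothetical; the error strings come from the machine error list, on the assumption that "stack empty" and "stack overflow" are the errors raised in these situations.

```python
class AMStack:
    """Toy model of one AM stack segment with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._cells = []            # the stack pointer is not exposed

    def push(self, value):
        """push <ea>,S: place a value on top of the stack."""
        if len(self._cells) >= self.capacity:
            raise RuntimeError("stack overflow")
        self._cells.append(value)

    def pop(self):
        """pop S,<ea>: remove the top value and return it for storing."""
        if not self._cells:
            raise RuntimeError("stack empty")
        return self._cells.pop()

    def popx(self):
        """popx S: remove the top value and discard it."""
        self.pop()
```

The only difference between pop and popx in this model is that popx throws the value away, which is useful for dropping parameters after a subroutine call.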

Example: 

popx  s(0)

Addressing  Modes: 

S:  s 


Format: 


POP 


Pop  a  Value 


POP 


Syntax: 

pop  S,<ea> 

where: 

<ea>  is  one  of  the  addressing  modes  listed  below 

Operation: 

S  -->  dest

Description:

The  source  value  is  popped  off  stack  S  and  stored  at  <ea>.  The  programmer
has  no  access  to  the  stack  pointer.

It  is  an  error  to  attempt  to  pop  a  value  from  an  empty  stack.

Example:

pop  s(0),r(0:1)
pop  s(0),data

data:  ds  1

Addressing  Modes:

S:  s

<ea>:  m,pcr,r,ri,rid,ridn

Format:

[ 0190 ]  { [ H844 ]...[ H849 ] }  [ operands ]